INTRODUCTION

In his first paper on wave mechanics Schroedinger^1 presented his famous equation in the form of a variation principle, indeed just the variation principle which we will be discussing in Section II. Thus our subject has deep roots in quantum mechanics, and of course, the general use of variation principles goes back much further.^2 Similarly the variation method, the general approximation method based on the variation principle, which we will be discussing in detail in subsequent sections, is one of the pillars of applied quantum mechanics since most approximation procedures are either direct applications of the variation method, or can be related to it in one way or another (and of course the use of variation methods to approximate the solution of physical problems has an even longer history).

Finally, if our choice of subject is in need of further justification, let us note that in recent years Ruedenberg^3 has shown that by taking the variation principle rather literally and imagining that as a molecule forms it actually does, so to speak, try one wave function and then another, relaxing a bit here, tightening a bit there, before finding the most suitable wave function, one can get real insight into the nature of chemical binding.

I. POSITIVE HERMITIAN OPERATORS

A positive Hermitian operator is a Hermitian operator whose eigenvalues are all non-negative. For our purposes the most important consequence of this is that the expectation value of such an operator is always non-negative. Proof: Let $O$ be a positive operator, $\phi_k$ a complete orthonormal set of its eigenfunctions, and $o_k$ its eigenvalues. Then, using a discrete notation, $(\psi, O\psi) = \sum_k |(\phi_k,\psi)|^2\, o_k$, which is $\ge 0$ since by assumption the $o_k$ are $\ge 0$.

We will also have use for the following trivial extension of these ideas: Suppose that although $O$ is not positive, we deal only with $\psi$'s such that $(\phi_k,\psi) = 0$ unless $o_k \ge 0$. Then clearly we still have $(\psi, O\psi) \ge 0$ for all such $\psi$. We will say that such an $O$ is positive with respect to the functions $\psi$.

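II. THE VARIATION PRINCIPLE

Given any function $\tilde\psi$ for which the requisite integrals exist (we will refer to such functions as "trial functions") we can calculate the real number

$$\tilde E = \frac{(\tilde\psi, H\tilde\psi)}{(\tilde\psi,\tilde\psi)}$$ (II-1)

Evidently $\tilde E$ would be the average energy of the system if the system were in the state described by the function $\tilde\psi$. Similarly if $\tilde\psi'$ is another trial function we can calculate the corresponding average energy

$$\tilde E' = \frac{(\tilde\psi', H\tilde\psi')}{(\tilde\psi',\tilde\psi')}$$ (II-2)

One property of $\tilde E$ and $\tilde E'$ is clear immediately; since each is an average energy, neither can be less than the smallest possible energy, that is neither can be smaller than the smallest eigenvalue of $H$. We will return to this point in a moment. To derive other properties of $\tilde E$ and $\tilde E'$ we now write $\tilde\psi'$ as

$$\tilde\psi' = \tilde\psi + \Delta$$ (II-3)

thereby defining $\Delta$. Then using (1) and (2) we find

$$\tilde E' = \frac{\tilde E\,(\tilde\psi,\tilde\psi) + (\Delta, H\tilde\psi) + (\tilde\psi, H\Delta) + (\Delta, H\Delta)}{(\tilde\psi',\tilde\psi')}$$ (II-4)

However from (3)

$$(\tilde\psi,\tilde\psi) = (\tilde\psi',\tilde\psi') - (\Delta,\tilde\psi) - (\tilde\psi,\Delta) - (\Delta,\Delta)$$

so we can write (4) as

$$\tilde E' = \tilde E + \frac{(\Delta,(H-\tilde E)\tilde\psi) + (\tilde\psi,(H-\tilde E)\Delta) + (\Delta,(H-\tilde E)\Delta)}{(\tilde\psi',\tilde\psi')}$$

Finally we use the Hermiticity of $(H-\tilde E)$ to find

$$\tilde E' = \tilde E + \frac{(\Delta,(H-\tilde E)\tilde\psi) + ((H-\tilde E)\tilde\psi,\Delta) + (\Delta,(H-\tilde E)\Delta)}{(\tilde\psi',\tilde\psi')}$$ (II-5)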

We will now draw several important conclusions from this result.

[1] Suppose that

$$(H-\tilde E)\tilde\psi = 0$$ (II-6)

that is, suppose that $\tilde\psi$ and $\tilde E$ are an eigenfunction and the corresponding eigenvalue of $H$. Then (5) becomes

$$\tilde E' = \tilde E + \frac{(\Delta,(H-\tilde E)\Delta)}{(\tilde\psi',\tilde\psi')}$$ (II-7)

which tells us, among other things, that when $\Delta$ is small so that $\tilde\psi'$ is nearly the eigenfunction $\tilde\psi$, then $\tilde E'$ differs from the corresponding eigenvalue $\tilde E$ by terms which are at least of second order in $\Delta$. Therefore the eigenvalues of $H$ are stationary points of $\tilde E$ as a functional of $\tilde\psi$.

[2] We will now show that $\tilde E$ has no other stationary points. Thus suppose that $\tilde E$ is a stationary point of $\tilde E$ as a functional of $\tilde\psi$. This then requires that the first order term in (5)

$$(\Delta,(H-\tilde E)\tilde\psi) + ((H-\tilde E)\tilde\psi,\Delta)$$

must vanish for all sufficiently small $\Delta$ and hence in particular must vanish for

$$\Delta = \epsilon\,(H-\tilde E)\tilde\psi$$

where $\epsilon$ is an arbitrarily small real number. Thus we must have

$$2\epsilon\,((H-\tilde E)\tilde\psi,(H-\tilde E)\tilde\psi) = 0$$

which can be satisfied only if

$$(H-\tilde E)\tilde\psi = 0$$

Therefore we have the result that if $\tilde E$ is a stationary point then $\tilde E$ is an eigenvalue and the corresponding $\tilde\psi$ is an eigenfunction. The characterization of the eigenvalues and eigenfunctions of $H$ provided by [1] and [2] constitutes a statement of the variation principle.

[3] Suppose now that $\tilde E$ is the smallest eigenvalue of $H$. Then $(H-\tilde E)$ has only non-negative eigenvalues and thus is a positive operator. In particular then this means that however large $\Delta$ may be

$$(\Delta,(H-\tilde E)\Delta) \ge 0$$

and hence, from (7) we see, as we noted at the outset, that the lowest eigenvalue of $H$ is the absolute minimum of $\tilde E$ as a functional of $\tilde\psi$.

On the other hand if $\tilde E$ is not the smallest eigenvalue then by choosing $\Delta$ to be an arbitrarily small linear combination of the lower (higher) eigenfunctions we can make $(\Delta,(H-\tilde E)\Delta)$ less than (greater than) zero. Thus the higher eigenvalues of $H$ are only stationary points of $\tilde E$ as a functional of $\tilde\psi$ and are neither maxima nor minima.
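
The content of [1] and [3] can be checked numerically on a toy problem. The following is only a sketch under stated assumptions: the 2x2 real symmetric matrix `H` stands in for the Hamiltonian, its entries are arbitrary illustrative numbers, and the helper names (`inner`, `apply`, `energy`, `energy_error`) are ours, not part of the text.

```python
import math

# Illustrative real symmetric 2x2 "Hamiltonian"; the numbers are arbitrary.
H = [[1.0, 0.5],
     [0.5, 3.0]]

def inner(u, v):
    """Real scalar product (u, v)."""
    return sum(a * b for a, b in zip(u, v))

def apply(M, v):
    """Matrix-vector product M v."""
    return [inner(row, v) for row in M]

def energy(psi):
    """Average energy (psi, H psi) / (psi, psi), cf. (II-1)."""
    return inner(psi, apply(H, psi)) / inner(psi, psi)

# Closed-form eigenvalues of a symmetric 2x2 matrix.
mean = 0.5 * (H[0][0] + H[1][1])
half_gap = math.hypot(0.5 * (H[0][0] - H[1][1]), H[0][1])
E_low, E_high = mean - half_gap, mean + half_gap

# Eigenvector belonging to E_low (valid because H[0][1] != 0).
psi_low = [H[0][1], E_low - H[0][0]]
# A direction orthogonal to psi_low, used as the error Delta of (II-3).
delta = [-psi_low[1], psi_low[0]]

def energy_error(eps):
    """E~' - E_low for the trial function psi_low + eps * delta."""
    trial = [p + eps * d for p, d in zip(psi_low, delta)]
    return energy(trial) - E_low

errors = [energy_error(eps) for eps in (1e-1, 1e-2, 1e-3)]
ratios = [errors[i] / errors[i + 1] for i in range(2)]
print(errors)  # every entry is positive: E_low is a minimum, as in [3]
print(ratios)  # each ratio is close to 100: the error scales as eps**2, as in [1]
```

Reducing the size of $\Delta$ by a factor of 10 reduces the energy error by a factor of about 100, which is the second order behaviour expressed by (II-7).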

[4] That the lowest eigenvalue is an absolute minimum of $\tilde E$ is a very striking result. However it does not in general serve to characterize the energies of the ground states of atoms or molecules since, because of the requirements of the Pauli Principle, these ground states are usually not the lowest states of the Hamiltonian; for example the ground state of the lithium atom is $(1s)^2 2s$ and not $(1s)^3$.

Happily however there is a similar theorem which is applicable to physical ground states and to various excited states as well. Namely suppose that $H$ commutes with certain commuting operators $\eta$ so that we can form a complete orthonormal set of eigenfunctions of $H$ which are also eigenfunctions of $\eta$, i.e., can be labelled by the eigenvalues of $\eta$. We will say that functions having the same $\eta$ quantum numbers have the same symmetry, and we will say that two functions which have different $\eta$ quantum numbers, and which are therefore orthogonal to one another, have different symmetry. (We will have no need to compare functions associated with different sets $\eta$.) If then $\tilde E$ is the smallest eigenvalue of $H$ for states of a given symmetry (for example for states satisfying the Pauli Principle) it follows that $(H-\tilde E)$ will be a positive operator with respect to functions of that symmetry because clearly such functions will be orthogonal to all the lower eigenfunctions.

If now we confine attention to $\tilde\psi'$ with the given symmetry, then $\Delta = \tilde\psi' - \tilde\psi$ will also have that symmetry and therefore however large $\Delta$ may be, still

$$(\Delta,(H-\tilde E)\Delta) \ge 0$$

Thus we have the result that the lowest eigenvalue of $H$ associated with a given symmetry is the absolute minimum of $\tilde E$ as a functional of trial functions of that symmetry.

[5] We now note that if $H$ commutes with $\eta$ and if $\tilde\psi$ has a definite symmetry then the variation $\Delta = \epsilon\,(H-\tilde E)\tilde\psi$ which played the decisive role in [2] will have that same symmetry. Thus we may generalize the result found there as follows: If $H$ commutes with $\eta$ and if $\tilde\psi$ has a certain symmetry, and if $\tilde E$ is stationary with respect to all variations of that symmetry, then $\tilde\psi$ is an eigenfunction and $\tilde E$ is the associated eigenvalue. In short, combining this last result with [1], the variation principle applies separately to each symmetry.

[6] As a generalization of [4] we have the following: Let $\tilde E$ be an arbitrary eigenvalue of $H$ and confine attention to $\tilde\psi'$ which are orthogonal to all eigenfunctions of $H$ whose associated eigenvalues are less than $\tilde E$. Then clearly we will have

$$(\Delta,(H-\tilde E)\Delta) \ge 0$$

and thus an arbitrary eigenvalue of $H$ is an absolute minimum of $\tilde E$ as a functional of trial functions orthogonal to eigenfunctions associated with smaller eigenvalues.


III. THE VARIATION METHOD 


The results of the previous section are of great practical impor- 
tance because they suggest a soundly based method for approximating the 
eigenvalues and eigenfunctions of $H$. According to the variation principle we can find the eigenvalues and eigenfunctions of $H$ by calculating $\tilde E$ for all $\tilde\psi$, and then looking for stationary points. In practice this is usually impossible - one cannot examine all $\tilde\psi$. However what one can do is to examine a restricted class of trial functions, a class no larger than one can handle, and then take the stationary points of $\tilde E$ within this restricted class as approximations to the eigenfunctions and eigenvalues of $H$.

This procedure is known as the variation method. We will call the $\tilde\psi$ which yield stationary values optimal trial functions and denote them by $\hat\psi$, possibly with a subscript. The corresponding $\tilde E$ we will denote by $\hat E$, again possibly with a subscript.

We said that this is a soundly based method. To support this assertion consider first the lowest state of a given symmetry. Then [4] of Section II tells us that we have a good approximation scheme in that it is capable of systematic improvement. Namely if we enlarge the class of trial functions (assumed to be of appropriate symmetry) then the minimum $\hat E$ will almost certainly decrease (in any case it cannot increase) whence, from [4], we will have a better approximation to the energy. Note also that we have a quadratic convergence to the eigenvalue in the sense that, as follows from [1], the error in the eigenvalue is of second order in the error of the eigenfunction.
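
As a concrete illustration of a restricted class of trial functions, here is a minimal sketch in which everything specific is our assumption rather than something taken from the text: the system is the hydrogen atom in atomic units, the class is the one-parameter family of Gaussians $e^{-\alpha r^2}$, and we use the standard closed form $\tilde E(\alpha) = \tfrac{3}{2}\alpha - 2\sqrt{2\alpha/\pi}$ for the average energy. The golden-section routine `golden_minimize` is a hypothetical helper written for this example.

```python
import math

def energy_gaussian(alpha):
    """Average energy of exp(-alpha r^2) for hydrogen, atomic units (assumed closed form)."""
    return 1.5 * alpha - 2.0 * math.sqrt(2.0 * alpha / math.pi)

def golden_minimize(f, lo, hi, tol=1e-10):
    """Golden-section search for the minimum of a unimodal f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d      # the minimum lies in [a, d]
        else:
            a = c      # the minimum lies in [c, b]
    return 0.5 * (a + b)

alpha_opt = golden_minimize(energy_gaussian, 0.01, 2.0)
E_opt = energy_gaussian(alpha_opt)
E_exact = -0.5  # exact hydrogen ground state energy in atomic units

print(alpha_opt, 8.0 / (9.0 * math.pi))   # numerical and analytic optimal alpha
print(E_opt, -4.0 / (3.0 * math.pi))      # numerical and analytic minimum
print(E_opt > E_exact)                    # the restricted minimum is an upper bound
```

The minimum within the Gaussian class lies above the exact ground state energy, as [4] of Section II requires; enlarging the class, for instance by allowing sums of several Gaussians, could only lower it.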

For the higher states of a given symmetry the situation at this point is not so clear. Result [6] of Section II is of little practical use since one usually cannot guarantee the required orthogonality. We can of course say that if we enlarge the class of trial functions we will make the higher $\hat E$ "more stationary", but this may or may not represent a numerical improvement. However in a later section we will discuss a practical way of choosing trial functions (the linear variation method) which does permit a systematic improvement in the approximation to higher eigenvalues.

Even from these brief remarks it should be clear that the variational approximation to eigenvalues is a soundly based one. It is harder to make a definite statement about the quality of the eigenfunction approximation, mainly because there are so many figures of merit which one might use - the overlap between the approximation and the eigenfunction, the accuracy of particular expectation values, the energy variance, etc. We will not attempt a quantitative discussion of these many possibilities but we will return to these questions from time to time in the succeeding sections.

In a general way however one usually says that the approximation to the eigenvalue furnished by $\hat E$ is better than the approximation to the eigenfunction furnished by $\hat\psi$ because, as we have already noted, the error in the former is of second order in the error in the latter. In this connection though it should be kept in mind that to some extent "order" is a theoretical concept, and that second order quantities are guaranteed to be smaller than first order quantities only if the order parameter is "sufficiently" small. Thus $\lambda^2$ is less than $\lambda$ only for $\lambda < 1$.

Also it should be admitted that the preceding discussion of energy is directly relevant only for very light systems since usually it is only in such cases that total energies are of immediate interest. Rather one is usually interested in comparatively small energy differences: excitation energies, ionization energies, changes in molecular energy with nuclear configuration, etc. Therefore since the difference of two upper bounds is not in general a bound, and since improving the individual upper bounds will not necessarily improve the difference, improvement of the accuracy of the total energy is not of immediate concern.^1 (Of course the difference between an upper bound and a lower bound is an upper bound but that is another story.) These considerations might then lead one to a pessimistic view of the applicability of the variation method to atoms and molecules. However in practice the opposite situation prevails - differencing of results of only moderate individual accuracy often giving results of even very high accuracy. In some cases this can be understood as a cancellation of obvious common errors, but in other cases, for example in recent calculations on He$_2$,^2 the process is by no means well understood.

IV. THE VARIATION METHOD; MORE DETAILS

In general the set of trial functions will be labelled by arbitrary numerical parameters and/or arbitrary functions. To implement the variation method then what one does in principle, and often in practice, is to calculate $\tilde E$ as a function of the variational parameters and/or functions and then determine their optimal values by setting equal to zero the derivatives of $\tilde E$ with respect to each parameter and function. This approach, when carried out formally, yields a set of equations which must be solved to determine the optimal values of the variational parameters and/or functions. We should however point out that in many practical calculations, which must be largely numerical rather than analytical, such equations are often partially or completely bypassed in favor of some sort of direct numerical search procedure to locate the stationary points of $\tilde E$.

For theoretical purposes, and sometimes also for practical purposes, it is, however, convenient to proceed a little more indirectly in the formal discussions. Starting from a given $\tilde\psi$, suppose that we change the parameters and/or functions infinitesimally in some way so that we go from $\tilde\psi$ to a "neighboring" function. If we denote the first order change in $\tilde\psi$, the variation in $\tilde\psi$, by $\delta\tilde\psi$, and if we write (II-1) as

$$(\tilde\psi,(H-\tilde E)\tilde\psi) = 0$$ (IV-1)

then we see that $\delta\tilde E$, the first order change in $\tilde E$, is determined by

$$(\delta\tilde\psi,(H-\tilde E)\tilde\psi) + (\tilde\psi,(H-\tilde E)\delta\tilde\psi) - \delta\tilde E\,(\tilde\psi,\tilde\psi) = 0$$ (IV-2)

Now the $\hat\psi$ are those $\tilde\psi$ which make $\tilde E$ stationary with respect to all variations possible within the set. Thus we must have

$$(\delta\hat\psi,(H-\hat E)\hat\psi) + (\hat\psi,(H-\hat E)\delta\hat\psi) = 0 \qquad \text{"all" } \delta\hat\psi$$ (IV-3)

where the quotation marks are to remind us that we are requiring that equation (3) hold only for variations within the set. We will sometimes write (3) as

$$\delta\tilde E = 0$$ (IV-4)

without explicitly stating the qualifications $\tilde\psi = \hat\psi$ and "all" $\delta\hat\psi$.

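Equation (3) together with

$$(\hat\psi,(H-\hat E)\hat\psi) = 0$$ (IV-5)

are then the equations to be used to determine the $\hat\psi$ and $\hat E$, and these are the equations which we will use to characterize the variation method. An obvious procedure at this point would be to eliminate $\hat E$ from (3) by means of (5), solve the resultant equations for $\hat\psi$ and then return to (5) to determine $\hat E$; and indeed this just leads to the straightforward procedure which we outlined at the beginning of this section. To see this, and also to make our notation a bit clearer, suppose for example that the set of trial functions is labelled by a set of $N$ (independent) real parameters $a_i$, $i = 1, \ldots, N$. For example for a single particle we might use [...]. Then evidently the most general $\delta\hat\psi$ is

$$\delta\hat\psi = \sum_{i=1}^{N} \frac{\partial\hat\psi}{\partial\hat a_i}\,\delta a_i$$

with the $\delta a_i$ real but otherwise arbitrary. Inserting this into (3) we then have

$$\sum_{i=1}^{N} \left[ \left(\frac{\partial\hat\psi}{\partial\hat a_i},(H-\hat E)\hat\psi\right) + \left(\hat\psi,(H-\hat E)\frac{\partial\hat\psi}{\partial\hat a_i}\right) \right] \delta a_i = 0$$ (IV-6)

which, we will now show, implies that

$$\left(\frac{\partial\hat\psi}{\partial\hat a_i},(H-\hat E)\hat\psi\right) + \left(\hat\psi,(H-\hat E)\frac{\partial\hat\psi}{\partial\hat a_i}\right) = 0, \qquad i = 1, \ldots, N$$ (IV-7)

Proof: That conditions (7) are sufficient to satisfy (6) is obvious. That they are necessary follows from the observation that since the $\delta a_i$ are arbitrary we can choose $\delta a_i = 0$ for $i \ne j$ and $\delta a_j \ne 0$. Then (6) yields (7) directly.

Let us now calculate $\partial\hat E/\partial\hat a_i$. From (5) we have

$$\hat E = \frac{(\hat\psi(\hat a), H\hat\psi(\hat a))}{(\hat\psi(\hat a),\hat\psi(\hat a))}$$

and hence finally

$$\frac{\partial\hat E}{\partial\hat a_i} = \frac{1}{(\hat\psi,\hat\psi)} \left[ \left(\frac{\partial\hat\psi}{\partial\hat a_i},(H-\hat E)\hat\psi\right) + \left(\hat\psi,(H-\hat E)\frac{\partial\hat\psi}{\partial\hat a_i}\right) \right]$$

Comparing this result with (7) we see that, as we stated earlier, equation (7) with (5) are equivalent to

$$\frac{\partial\hat E}{\partial\hat a_i} = 0, \qquad i = 1, \ldots, N$$ (IV-8)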
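
The equivalence between the moment-like conditions (IV-7) and the vanishing of the ordinary derivatives (IV-8) can be checked on a small example. This is a sketch under our own assumptions: a 2x2 real symmetric matrix `H` with arbitrary entries plays the Hamiltonian, the trial family is $\tilde\psi(a) = (\cos a, \sin a)$ with a single real parameter $a$, and `bisect_root` is a hypothetical helper standing in for whatever root finder one prefers.

```python
import math

H = [[1.0, 0.5],
     [0.5, 3.0]]   # illustrative numbers only

def inner(u, v):
    return sum(x * y for x, y in zip(u, v))

def apply(M, v):
    return [inner(row, v) for row in M]

def psi(a):
    """One-parameter trial family (assumed for this example)."""
    return [math.cos(a), math.sin(a)]

def dpsi(a):
    """Derivative of the trial function with respect to the parameter."""
    return [-math.sin(a), math.cos(a)]

def energy(a):
    p = psi(a)
    return inner(p, apply(H, p)) / inner(p, p)

def residual(v, E):
    """(H - E) v."""
    return [hv - E * x for hv, x in zip(apply(H, v), v)]

def gradient(a):
    """Left-hand side of (IV-7) divided by (psi, psi)."""
    p, dp, E = psi(a), dpsi(a), energy(a)
    return (inner(dp, residual(p, E)) + inner(p, residual(dp, E))) / inner(p, p)

def finite_difference(a, h=1e-6):
    """Central-difference estimate of dE/da, cf. (IV-8)."""
    return (energy(a + h) - energy(a - h)) / (2.0 * h)

def bisect_root(f, lo, hi, tol=1e-12):
    """Bisection, assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

a_opt = bisect_root(gradient, -math.pi / 4, math.pi / 4)
E_opt = energy(a_opt)
r = residual(psi(a_opt), E_opt)
res_norm = math.sqrt(inner(r, r))

print(gradient(0.3), finite_difference(0.3))  # the two derivatives agree
print(E_opt, res_norm)                        # stationary energy and |(H - E) psi|
```

In this special case the family happens to contain every normalized real vector, so the optimal trial function is an exact eigenvector and the residual vanishes; for a genuinely restricted family only the derivative would vanish.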


We have implied that the use of equation (3) offers certain advantages over the use of equation (8). As a first illustration of this point we now remark that often (5) is a special case of (3), so that (3) alone then suffices to completely characterize the variation method. Namely frequently (the linear variation method which we will discuss in detail in subsequent sections is an important case in point) a set of trial functions will have no fixed overall scale, that is if $\tilde\psi$ is a member of the set then so is $A\tilde\psi$ where $A$ is an arbitrary constant. In such cases then, among the neighbors of $\hat\psi$ in the set will be $(1+\delta A)\hat\psi$ where $\delta A$ is a small real constant, which in turn implies that (3) must be satisfied by $\delta\hat\psi = \delta A\,\hat\psi$. Inserting this into (3) and cancelling a factor of $\delta A$ then yields (5) which proves the point.

V. THE VARIATION METHOD AND MOMENTS OF THE SCHRODINGER EQUATION

Using the Hermiticity of $(H-\hat E)$, Eq. (IV-3) can be written as

$$(\delta\hat\psi,(H-\hat E)\hat\psi) + ((H-\hat E)\hat\psi,\delta\hat\psi) = 0$$ (V-1)

or

$$\mathrm{Re}\,(\delta\hat\psi,(H-\hat E)\hat\psi) = 0$$ (V-2)

We now note that if, as is often the case in practice, there are no a priori reality conditions on the variational parameters and/or functions then the equations (1) are equivalent to the seemingly stronger equations

$$(\delta\hat\psi,(H-\hat E)\hat\psi) = 0$$ (V-3)

To see this let us suppose that the $\tilde\psi$ are labelled by a single arbitrary function $\chi$, the generalization to several functions and/or parameters then being obvious. Then in general

$$\delta\hat\psi = \frac{\partial\hat\psi}{\partial\hat\chi}\,\delta\chi$$

so that (1) becomes

$$\left(\frac{\partial\hat\psi}{\partial\hat\chi}\,\delta\chi,(H-\hat E)\hat\psi\right) + \left((H-\hat E)\hat\psi,\frac{\partial\hat\psi}{\partial\hat\chi}\,\delta\chi\right) = 0 \qquad \text{all } \delta\chi$$ (V-4)

where, of course, there is no need to qualify "all". That there are no a priori reality restrictions on $\chi$ means that we can vary its real and imaginary parts separately. Therefore we may replace (4) by

$$\left(\frac{\partial\hat\psi}{\partial\hat\chi}(\delta\chi)_R,(H-\hat E)\hat\psi\right) + \left((H-\hat E)\hat\psi,\frac{\partial\hat\psi}{\partial\hat\chi}(\delta\chi)_R\right) = 0 \qquad \text{all } (\delta\chi)_R$$ (V-5)

and

$$-\left(\frac{\partial\hat\psi}{\partial\hat\chi}(\delta\chi)_I,(H-\hat E)\hat\psi\right) + \left((H-\hat E)\hat\psi,\frac{\partial\hat\psi}{\partial\hat\chi}(\delta\chi)_I\right) = 0 \qquad \text{all } (\delta\chi)_I$$ (V-6)

where $(\delta\chi)_R$ and $(\delta\chi)_I$ are the real and imaginary parts of $\delta\chi$ and where in (6) we have cancelled out a factor of $i$. We now derive (3) as follows: (6) is to be true for all $(\delta\chi)_I$ and therefore in particular, for a given $(\delta\chi)_R$, (6) must be true with $(\delta\chi)_I = (\delta\chi)_R$. Making this substitution in (6) and comparing with (5) then immediately yields

$$\left(\frac{\partial\hat\psi}{\partial\hat\chi}(\delta\chi)_R,(H-\hat E)\hat\psi\right) = 0$$ (V-7)

Similarly for a given $(\delta\chi)_I$, (5) must be satisfied with $(\delta\chi)_R = (\delta\chi)_I$ which then leads to

$$\left(\frac{\partial\hat\psi}{\partial\hat\chi}(\delta\chi)_I,(H-\hat E)\hat\psi\right) = 0$$ (V-8)

If now we multiply (8) by $(-i)$ and add to (7) then the result is (3).

Having gone through this in detail it is now useful to note the following quick derivation: If there are no a priori reality restrictions then if $\delta\hat\psi$ is a possible variation of $\hat\psi$ within the set, so is $i\,\delta\hat\psi$. (Proof: Choose $\delta\chi \to i\,\delta\chi$.) Therefore (1) must also be satisfied if we replace $\delta\hat\psi$ by $i\,\delta\hat\psi$. Doing this and cancelling a factor of $i$ then yields

$$-(\delta\hat\psi,(H-\hat E)\hat\psi) + ((H-\hat E)\hat\psi,\delta\hat\psi) = 0$$ (V-9)

Comparison with (1) then yields (3). Finally we note that (1) and (3) are trivially equivalent if $H$ is explicitly real and if one restricts oneself to real trial functions.

When (3) applies it provides an interesting and suggestive interpretation of the variation method. In a general way, given a function $F$, quantities of the form

$$(G, F)$$

for various choices of $G$, are referred to as "moments" of $F$. Thus we can say that when (3) applies, the variation method approximates making $(H-\hat E)\hat\psi = 0$ by requiring the vanishing of a restricted set of moments of $(H-\hat E)\hat\psi$. (Note that the other basic equation, $(\hat\psi,(H-\hat E)\hat\psi) = 0$, is also in moment form.)

The approximation of requiring only that certain moments of $(H-\hat E)\hat\psi$ vanish is certainly one which one might come upon, and indeed one which people have come upon, without reference to the variation method. In particular consider the linear variation method (which we will discuss in more detail in succeeding sections) in which the set of trial functions consists of functions of the form

$$\tilde\psi = \sum_{k=1}^{M} a_k\,\phi_k$$ (V-10)

where the $\phi_k$ (the "basis set") are a given set of linearly independent functions, and where the $a_k$ are arbitrary parameters. If no reality conditions are imposed on the $a_k$, then (3) applies so that with

$$\hat\psi = \sum_{l=1}^{M} \hat a_l\,\phi_l, \qquad \delta\hat\psi = \sum_{k=1}^{M} \delta a_k\,\phi_k$$ (V-11)

we have

$$\sum_{k=1}^{M}\sum_{l=1}^{M} \delta a_k^{*}\,(\phi_k,(H-\hat E)\phi_l)\,\hat a_l = 0$$

and therefore (recall the proof following (IV-7))

$$\sum_{l=1}^{M} (\phi_k,(H-\hat E)\phi_l)\,\hat a_l = 0, \qquad k = 1, \ldots, M$$ (V-12)

Now the point we want to make is that one can arrive at these same equations, and people often do, by first writing down the "Schrodinger Equation" (the reason for the " " will be discussed in a moment):

$$(H-\hat E)\sum_{l=1}^{M} \hat a_l\,\phi_l \text{ "=" } 0$$

and then simply taking the scalar product with each $\phi_k$ in turn.

This sort of approach to the derivation of equations (12) suggests other possibilities. Since the use of $\hat E$ and $\hat a_l$ has special reference to the variation method let us consider the more neutral "equation"

$$(H-E)\sum_{l=1}^{M} a_l\,\phi_l \text{ "=" } 0$$ (V-13)

Then we note that although the procedure of "taking the scalar product" provides one way of trying to determine $E$ and the $a_l$, there are other possibilities. For example one might try to satisfy (13) identically at $M$ selected points, or more generally one might try multiplying through by quite another set of $M$ functions $G_k$ and integrate to find

$$\sum_{l=1}^{M} (G_k,(H-E)\phi_l)\,a_l = 0, \qquad k = 1, \ldots, M$$ (V-14)

[Evidently this reduces to the second suggestion if the $G_k$ are Dirac delta functions.] These observations of course raise questions as to the relative status of these various approaches. Are they equivalent? Is one superior to the other? First as to the equivalence: In general the different procedures (different choices for the set of $G_k$) will lead to different answers. The point is simply that (13) as it stands is almost certainly an inconsistent equation - there are no $a_l$ and $E$ which satisfy it (hence our use of " ") - or more precisely, it is a consistent equation only if there happens to be an eigenfunction of $H$ which can be written as a linear combination of the $\phi_l$. Since in practice in a complicated problem this is unlikely, we may take it that "equation" (13) is not consistent and hence it follows that different methods of "solution" will in general lead to different results.
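
That different choices of the weight functions lead to different answers is easy to see in a small model. The sketch below rests entirely on our own assumptions: a 3x3 real symmetric matrix `H` with arbitrary entries plays the Hamiltonian, the basis consists of two of the three unit vectors (so that no eigenvector of `H` lies in the space they span), and the second set of weights is an arbitrary choice made only to differ from the basis. `moment_roots` is a hypothetical helper that solves the 2x2 determinantal condition belonging to (V-14).

```python
import math

H = [[1.0, 0.5, 0.2],
     [0.5, 2.0, 0.3],
     [0.2, 0.3, 4.0]]   # illustrative numbers only

basis = [[1.0, 0.0, 0.0],
         [0.0, 1.0, 0.0]]

def inner(u, v):
    return sum(x * y for x, y in zip(u, v))

def apply(M, v):
    return [inner(row, v) for row in M]

def moment_roots(weights):
    """Roots E of det[(G_k, (H - E) phi_l)] = 0 for two weight vectors G_k."""
    A = [[inner(g, apply(H, p)) for p in basis] for g in weights]
    B = [[inner(g, p) for p in basis] for g in weights]
    # det(A - E B) = c2 E^2 + c1 E + c0
    c2 = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    c1 = -(A[0][0] * B[1][1] + A[1][1] * B[0][0]
           - A[0][1] * B[1][0] - A[1][0] * B[0][1])
    c0 = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = c1 * c1 - 4.0 * c2 * c0
    if disc < 0:
        # Nothing forces real roots when the weights differ from the basis.
        raise ValueError("complex roots")
    root = math.sqrt(disc)
    return sorted([(-c1 - root) / (2.0 * c2), (-c1 + root) / (2.0 * c2)])

# Weights equal to the basis: the variation method, (V-12).
variational = moment_roots(basis)
# Quite another set of weights (arbitrary choice for illustration).
other = moment_roots([[1.0, 0.0, 0.5],
                      [0.0, 1.0, 0.5]])

print(variational)
print(other)
```

The two sets of roots differ, as they must when (V-13) is inconsistent; only the first set comes with the bounding properties of the variation method.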
Now as to the advantages of one method over another. As we have seen the variation method leads to (12) and therefore, as we know, this endows it with the virtue that the lowest $\hat E$ is a guaranteed upper bound to the lowest eigenvalue of $H$ of appropriate symmetry. Indeed, as we shall see in Section VII, it is even more virtuous: the $\hat E$ which are solutions of (12) are, in order, guaranteed upper bounds to the $M$ lowest eigenvalues of $H$ of appropriate symmetry. Thus there is considerable reason to choose (12). However recently there has been a revival of interest in the use of equations of the form

$$(G_k,(H-\hat E)\hat\psi) = 0$$ (V-15)

where $\hat\psi$ may be of the form (10), but may also be of a much more complicated structure, and where the $G_k$ may be given functions, or given functions multiplying operators, or may involve some of the arbitrary parameters and/or functions in $\hat\psi$ which are to be determined from equations (15). In any case the reason for the interest is quite simply that with the forms of $\hat\psi$ which are in use (or which one would like to use) in the applications of (3) to atoms and molecules, the integrals in (13) are often quite difficult (or impossible in practice) whereas with a $\hat\psi$ of similar form and with a suitable choice of the $G_k$, the integrals in (15) are quite tractable. We will not discuss such methods further here but instead will refer the interested reader to the original literature.^1 We would emphasize however, that such methods do not in general yield bounds.

VI. THE LINEAR VARIATION METHOD

Let us now return to Eq. (V-12). This is a set of linear homogeneous equations to determine the $\hat a_l$. It has non-trivial solutions (that is, not all $\hat a_l = 0$) only for certain values of $\hat E$, those for which the determinant of coefficients (the "secular determinant") vanishes:

$$\left|\,(\phi_k,(H-\hat E)\phi_l)\,\right| = 0$$ (VI-1)

Equation (1), the "secular equation", is an $M$'th order algebraic equation to determine $\hat E$; and incidentally note that in accordance with the discussion at the end of Sec. IV, we have not had to invoke (IV-5) explicitly since the set of $\tilde\psi$ which we are using clearly has no fixed overall scale. We will denote the roots of (1) by $\hat E_k$ with $k = 1, \ldots, M$ and $\hat E_1 \le \hat E_2 \le \cdots \le \hat E_M$. Similarly we will denote the corresponding $\hat\psi$ by $\hat\psi_k$.

The set of trial functions (V-10) has the special property of 
forming a linear space (a subspace of Hilbert space) since any linear 
combination of such trial functions is again a member of the set. In contrast a set of functions depending non-linearly on a variational parameter does not in general form a linear space, since for example the sum of two members with different values of the parameter is not again of the same form. There are other interesting sets

of trial functions which form linear spaces. Thus the set of all 


functions of a given symmetry form a linear space. Also there has been 
considerable interest in the so called "S-limit" for helium-like ions in which one deals with all functions of the form $\tilde\psi(r_1, r_2)$ where $r_1$ and $r_2$ are the distances of the two electrons from the nucleus. Clearly the set of all such functions, or indeed the subset of all such functions which are symmetric (antisymmetric), forms a linear space.^1

We will now show that whenever the set of trial functions forms a linear space, then although the $\hat\psi_k$ and $\hat E_k$ (we will use the same notation for the general case as for the special case (V-10)) are generally only approximations to the eigenfunctions and eigenvalues of $H$, they are exact eigenfunctions and eigenvalues of the "projected Hamiltonian"

$$\bar H = \pi H \pi$$ (VI-2)

where $\pi$ is the Hermitian projection operator onto the linear space spanned by the trial functions:

$$\pi\psi = \psi \quad \text{if } \psi \text{ is in the space}$$

$$\pi\psi = 0 \quad \text{if } \psi \text{ is orthogonal to the space}$$

Proof: From [1] and [2] of Sec. II the conditions

$$(\Delta,(\bar H-\hat E_k)\hat\psi_k) + (\hat\psi_k,(\bar H-\hat E_k)\Delta) = 0 \qquad \text{all } \Delta$$ (VI-5)

are both necessary and sufficient for $\hat\psi_k$ and $\hat E_k$ to be an eigenfunction and corresponding eigenvalue respectively of $\bar H$. Now any $\Delta$ can be written

$$\Delta = \Delta_\parallel + \Delta_\perp$$ (VI-6)

where $\Delta_\parallel$ is wholly in the linear space and where $\Delta_\perp$ is orthogonal to the space. In particular then, this means that $\Delta_\perp$ is orthogonal to $(\bar H-\hat E_k)\hat\psi_k$ and therefore that the contribution of $\Delta_\perp$ to the left hand side of (5) vanishes identically. Thus we are now left with showing that

$$(\Delta_\parallel,(\bar H-\hat E_k)\hat\psi_k) + (\hat\psi_k,(\bar H-\hat E_k)\Delta_\parallel) = 0$$ (VI-7)

We now note that since $\pi\hat\psi_k = \hat\psi_k$ and $\pi\Delta_\parallel = \Delta_\parallel$ we may replace $\bar H$ in (7) by $H$ which yields

$$(\Delta_\parallel,(H-\hat E_k)\hat\psi_k) + (\hat\psi_k,(H-\hat E_k)\Delta_\parallel) = 0$$ (VI-8)

But now we are finished because we know that (8) is true, since it is just the basic equation of the variation method written in somewhat different notation. Namely since we are dealing with a linear space it should be clear that the functions which are close to $\hat\psi_k$ can all be written in the form $\hat\psi_k + \epsilon\,\Delta_\parallel$ where $\Delta_\parallel$ is an arbitrary member of the space and where $\epsilon$ is a small real number. Thus a general $\delta\hat\psi_k$ is of the form $\epsilon\,\Delta_\parallel$ and this, when inserted in (IV-3), yields (8) which proves the point. Of course any function orthogonal to the space is also an eigenfunction of $\bar H$ with eigenvalue zero.

The observation that for a linear space the $\hat\psi_k$ and $\hat E_k$ are eigenfunctions and corresponding eigenvalues of the Hermitian operator $\bar H$ immediately leads to the following important results which among other things show that the $\hat\psi_k$ and $\hat E_k$ have some of the formal properties of the eigenfunctions and eigenvalues of $H$ (these results can also be derived directly from the variational equations, for example from the equations (V-12) in the case of the linear variation method):

(0) Considering $\tilde E$ as a function of the variation parameters and/or functions, $\tilde E = \hat E_1$ is an absolute minimum, $\tilde E = \hat E_M$ is an absolute maximum and $\tilde E = \hat E_k$, $k \ne 1, M$, are only stationary points, neither maxima nor minima.

(i) The $\hat E_k$ are real - this of course has been clear since Sec. II.

(ii) If $\hat E_k \ne \hat E_l$, $\hat\psi_k$ and $\hat\psi_l$ are automatically orthogonal, while if $\hat E_k = \hat E_l$ but $\hat\psi_k \ne \hat\psi_l$, i.e. if there is degeneracy, $\hat\psi_k$ and $\hat\psi_l$ can be chosen to be orthogonal. A general degenerate $\hat\psi$ is then some linear combination of the orthogonal ones. If further we assume, as we shall, that the $\hat\psi_k$ are normalized then we have

$$(\hat\psi_k,\hat\psi_l) = \delta_{kl}$$ (VI-9)

(iii) Since $\bar H\hat\psi_l = \hat E_l\hat\psi_l$ we have, from (i) and (ii), that

$$(\hat\psi_k,\bar H\hat\psi_l) = \hat E_k\,\delta_{kl}$$

or since $\bar H = \pi H \pi$ and $\pi\hat\psi_k = \hat\psi_k$

$$(\hat\psi_k, H\hat\psi_l) = \hat E_k\,\delta_{kl}$$ (VI-10)

Thus in words: the $\hat\psi_k$ are a finite orthonormal set of functions which diagonalize $H$ within the finite space which they span.

(iv)

$$H\hat\psi_k = \hat E_k\hat\psi_k + \Lambda_k$$ (VI-11)

where $\Lambda_k$ is orthogonal to the finite space. Proof: Since the $\hat\psi_l$ span the space we can certainly write $H\hat\psi_k = \sum_l A_{kl}\hat\psi_l + \Lambda_k$. But then from (9) and (10) and the fact that $(\hat\psi_l,\Lambda_k) = 0$ we have $A_{kl} = \hat E_k\,\delta_{kl}$ and therefore (11).

(v) As a converse theorem, if we have a set of functions $\hat\psi_k$ with the properties (9), (10), and hence (11), then if we use them as basis functions in a linear variation calculation, the optimal trial functions will just be the $\hat\psi_k$ again and the $\hat E$ will be the $\hat E_k$. Proof: Let $\pi$ be the projection onto the space of the $\hat\psi_k$. Then we have $\bar H\hat\psi_k = \pi H\hat\psi_k$ which from (11) equals $\hat E_k\,\pi\hat\psi_k = \hat E_k\hat\psi_k$. Thus the $\hat\psi_k$ are eigenfunctions of $\bar H$ with the $\hat E_k$ as eigenvalues, and this proves the point.
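
Properties (ii) to (iv) can be verified directly in a small linear variation calculation. As before this is only a sketch under our own assumptions: `H` is an arbitrary 3x3 real symmetric matrix standing in for the Hamiltonian, the two basis vectors are deliberately chosen non-orthogonal, and the secular equation is solved as a quadratic, which is feasible only because $M = 2$ here.

```python
import math

H = [[1.0, 0.5, 0.2],
     [0.5, 2.0, 0.3],
     [0.2, 0.3, 4.0]]   # illustrative numbers only

# Two linearly independent, non-orthogonal basis functions (assumed).
basis = [[1.0, 0.0, 0.0],
         [1.0, 1.0, 0.0]]

def inner(u, v):
    return sum(x * y for x, y in zip(u, v))

def apply(M, v):
    return [inner(row, v) for row in M]

# Matrix elements (phi_k, H phi_l) and overlaps (phi_k, phi_l).
Hm = [[inner(p, apply(H, q)) for q in basis] for p in basis]
S = [[inner(p, q) for q in basis] for p in basis]

# Secular equation det(Hm - E S) = c2 E^2 + c1 E + c0 = 0, cf. (VI-1).
c2 = S[0][0] * S[1][1] - S[0][1] ** 2
c1 = -(Hm[0][0] * S[1][1] + Hm[1][1] * S[0][0] - 2.0 * Hm[0][1] * S[0][1])
c0 = Hm[0][0] * Hm[1][1] - Hm[0][1] ** 2
root = math.sqrt(c1 * c1 - 4.0 * c2 * c0)
E_hat = [(-c1 - root) / (2.0 * c2), (-c1 + root) / (2.0 * c2)]

def optimal_function(E):
    """Normalized optimal trial function for the root E."""
    # Coefficients from the first of the two secular equations (V-12).
    a = [Hm[0][1] - E * S[0][1], -(Hm[0][0] - E * S[0][0])]
    psi = [a[0] * p + a[1] * q for p, q in zip(basis[0], basis[1])]
    norm = math.sqrt(inner(psi, psi))
    return [x / norm for x in psi]

psi_hat = [optimal_function(E) for E in E_hat]

overlap = inner(psi_hat[0], psi_hat[1])                # off-diagonal part of (VI-9)
coupling = inner(psi_hat[0], apply(H, psi_hat[1]))     # off-diagonal part of (VI-10)
diagonal = [inner(p, apply(H, p)) for p in psi_hat]    # diagonal part of (VI-10)

# Remainder of (VI-11) for k = 1 and its components along the basis.
remainder = [hx - E_hat[0] * x for hx, x in zip(apply(H, psi_hat[0]), psi_hat[0])]
remainder_in_space = [inner(p, remainder) for p in basis]

print(E_hat, diagonal)
print(overlap, coupling, remainder_in_space)
```

The overlap, the coupling, and the components of the remainder within the space all vanish to rounding error, while the remainder itself does not vanish because the optimal functions are not eigenfunctions of the full `H`.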

Although the set of trial functions (V-10) and the "s-limit" 
functions each form a linear subspace, there is one obvious difference 
between tHem; the space formed by the former is of finite dimensionality 
while that formed by the latter is infinite. This has important con- 


sequences in practice^ £ 


e can fairly readily solve 


finite problems, particularly algebraic problems, to arbitrary accuracy 
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(•and the same is true for ordinary differential equations) , However 
really infinite problems typically involving the solution of non separable 
partial differential or integral equations, are usually intractable 
and make it necessary to introduce further approximations, although 
recently partial differential equations in two variables, such as 

2 

occur with the S-limit problem have begun to come under direct attack. 

Usually these further approximations consist simply in again using 

the variation method but with the ^ finitely, though not neces- 

3 

sarily linearly, parametrized subset of the infinite, linear space. 

Of course if the finite subspace i^ linear then the and E 

will be eigenfunctions and eigenvalues of '8’ H ^ where is 

the projection onto the finite subspace. 

Since the linear variation method leads to a finite problem it 
has been widely used and goes under various names: The Ritz method, 

the Rayleigh-Ritz method, the method of linear variational parameters, 
etc. As the name Rayleigh suggests its use predates quantum mechanics; 
it has been applied to all kinds of vibration problems, and quite 
generally wherever eigenvalue problems occur. 

In atomic and molecular problems one common application of the linear variation method is in the configuration interaction method (CI). Here, with $H$ a fixed nucleus Hamiltonian, the $\phi_k$ are Slater determinants made out of given spin orbitals (the spin orbitals often also involving non-linear parameters - see end of Sec. VII). If one uses all the determinants of appropriate symmetry which one can make from the given spin orbitals then one speaks of complete CI; otherwise one speaks of incomplete CI. In this connection it is important to keep in mind that even with a modest number of spin orbitals the complete CI problem, though finite, may become impractically large. For example if one has 10 electrons and 20 spin orbitals one can form

$$\frac{20!}{10!\,10!}$$

Slater determinants! Of course probably for reasons of symmetry not all of these need be used but still the numbers can become enormous. Thus partial CI, involving a selection of (hopefully) the most important "configurations", becomes the practical alternative when one deals with even moderately complicated systems.
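As a rough illustration of how quickly the complete CI problem grows, the count quoted above can be evaluated directly. The sketch below only counts the ways of occupying the spin orbitals; it ignores any reduction from symmetry, and the function name `n_determinants` is a hypothetical helper introduced here for illustration.

```python
from math import comb


def n_determinants(n_spin_orbitals: int, n_electrons: int) -> int:
    """Number of Slater determinants obtainable by choosing which
    n_electrons of the n_spin_orbitals are occupied (no symmetry reduction)."""
    return comb(n_spin_orbitals, n_electrons)


# The example in the text: 10 electrons in 20 spin orbitals.
print(n_determinants(20, 10))

# The count grows rapidly as spin orbitals are added for the same 10 electrons.
for m in (20, 30, 40):
    print(m, n_determinants(m, 10))
```

For 10 electrons in 20 spin orbitals this gives 184756 determinants before any selection on symmetry grounds.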


VII. LINEAR SPACES AND EXCITED STATES

We have by now mentioned several times that all the $\hat{E}_k$ furnished by the linear variation method have bounding properties. We now want to prove this. More generally we will show that whenever the set of trial functions form a linear space (having a definite symmetry if symmetry considerations are applicable) then the successive $\hat{E}_k$ are upper bounds to the corresponding successive bound state eigenvalues of $H$ (of that symmetry).


To prove this we first note that from (VI-9) and (VI-10) the average energy in a state described by $\psi = \sum_{k=1}^{M} c_k\hat{\psi}_k$ is

$$E = \frac{\sum_{k=1}^{M} |c_k|^2\,\hat{E}_k}{\sum_{k=1}^{M} |c_k|^2} \qquad \text{(VII-1)}$$

and therefore is not greater than $\hat{E}_n$, where $n$ is the largest $k$ for which $c_k \neq 0$. Further we note that there is at least one linear combination of the first $N$ $\hat{\psi}_k$ which is orthogonal to the lowest $(N-1)$ eigenfunctions of $H$ (having the same symmetry). From what we have just proven the average energy for this function will be less than or equal to $\hat{E}_N$, while from [6] of Sec. II it is certainly not less than $E_N$, the $N$'th smallest eigenvalue of $H$ (of that symmetry). Thus we have, as announced, that

$$\hat{E}_N \ge E_N \qquad \text{(VII-2)}$$


We will now show further that the bounds (2) are improvable bounds in that if we are dealing with a finite space, then enlarging the space will improve or at any rate not worsen them. Thus as already mentioned in Sec. III, the linear variation method provides a soundly based method for approximating the higher eigenvalues of $H$.

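We start with a basis set of $M$ functions. Let us note this explicitly by writing $\hat{\psi}_k(M)$ and $\hat{E}_k(M)$ instead of $\hat{\psi}_k$ and $\hat{E}_k$. Thus in particular (VI-10) becomes

$$(\hat{\psi}_k(M), H\hat{\psi}_l(M)) = \hat{E}_k(M)\,\delta_{kl} \qquad \text{(VII-3)}$$

Suppose now that we add one more function $\phi$ to our basis set. We may assume without loss of generality that $\phi$ is normalized and orthogonal to all the original basis functions, and hence orthogonal to all the $\hat{\psi}_k(M)$:

$$(\phi, \hat{\psi}_k(M)) = 0 \qquad \text{(VII-4)}$$

and of course we continue to have

$$(\hat{\psi}_k(M), \hat{\psi}_l(M)) = \delta_{kl} \qquad \text{(VII-5)}$$

Let us write our new optimal trial function as

$$\hat{\psi} = \sum_{k=1}^{M} a_k\hat{\psi}_k(M) + b\,\phi \qquad \text{(VII-6)}$$

where, for convenience, we will use the $\hat{\psi}_k(M)$ instead of the original basis functions. If we now insert $\hat{\psi}$, with $\delta\psi = \sum_{k=1}^{M}\delta a_k\,\hat{\psi}_k(M) + \delta b\,\phi$ and $\delta a_k$ and $\delta b$ arbitrary, into (V-3) and use (3), (4) and (5), the following equations result:

$$(\hat{E}_k(M) - \hat{E})\,a_k + (\hat{\psi}_k(M), H\phi)\,b = 0 \qquad \text{(VII-7)}$$

and

$$\sum_{k=1}^{M} (\phi, H\hat{\psi}_k(M))\,a_k + \left[(\phi, H\phi) - \hat{E}\right] b = 0 \qquad \text{(VII-8)}$$

From (7) we then have

$$a_k = \frac{(\hat{\psi}_k(M), H\phi)\,b}{\hat{E} - \hat{E}_k(M)}$$

which, when inserted into (8), yields an equation for $\hat{E}$:

$$\hat{E} - (\phi, H\phi) = \sum_{k=1}^{M} \frac{|(\phi, H\hat{\psi}_k(M))|^2}{\hat{E} - \hat{E}_k(M)} \equiv \Omega \qquad \text{(VII-9)}$$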

If the $\hat{E}_k(M)$ are all distinct and if none of the $(\phi, H\hat{\psi}_k(M))$ vanishes (we will shortly remove these restrictions) then $\Omega$ as a function of $\hat{E}$ obviously has the following properties: It has simple poles at $\hat{E} = \hat{E}_k(M)$. It is negative immediately to the left of the poles and positive to the right of the poles. It goes to zero through positive (negative) values when $\hat{E}$ tends to positive (negative) infinity.

The solutions of (9), let us denote them by $\hat{E}_k(M+1)$, are then the intersections of $\Omega$ with the straight line $\hat{E} - (\phi, H\phi)$. The situation is shown graphically below for $M = 4$.

[Figure: $\Omega$ and the straight line $\hat{E} - (\phi, H\phi)$ plotted against $\hat{E}$ for $M = 4$.]

From the graph one sees that

$$\hat{E}_{k-1}(M) \le \hat{E}_k(M+1) \le \hat{E}_k(M) \qquad \text{(VII-10)}$$

and hence in particular

$$\hat{E}_k(M+1) \le \hat{E}_k(M) \qquad \text{(VII-11)}$$

which is what we wanted to prove (for a more elegant proof see Appendix A).
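The graphical argument can also be checked numerically. The following is a minimal sketch, assuming real couplings, distinct old eigenvalues and no vanishing coupling; the function name `secular_roots` and the sample numbers are hypothetical and are not taken from the text. It solves an equation of the form $\hat{E} - h = \sum_k v_k^2/(\hat{E} - \hat{E}_k(M))$ by bisection, using the fact that the left side minus the right side increases monotonically between successive poles.

```python
def secular_roots(e_old, v, h, iters=200):
    """Roots of E - h = sum_k v[k]**2 / (E - e_old[k]), found by bisection.

    e_old : strictly increasing list of the M old eigenvalues
    v     : nonzero real couplings between the new function and the old ones
    h     : diagonal matrix element of the new function
    """
    def f(e):
        # f increases on every interval between poles, since
        # f'(E) = 1 + sum_k v_k^2 / (E - E_k)^2 > 0.
        return e - h - sum(vk * vk / (e - ek) for vk, ek in zip(v, e_old))

    # Gershgorin-type padding so the outermost roots are bracketed.
    pad = sum(abs(vk) for vk in v) + 1.0
    edges = [min(e_old[0], h) - pad] + list(e_old) + [max(e_old[-1], h) + pad]

    roots = []
    for a, b in zip(edges[:-1], edges[1:]):
        # Exactly one root lies in each interval; f is never evaluated at a pole.
        for _ in range(iters):
            mid = 0.5 * (a + b)
            if mid == a or mid == b:
                break
            if f(mid) < 0.0:
                a = mid
            else:
                b = mid
        roots.append(0.5 * (a + b))
    return roots


old = [-1.0, 0.0, 2.0]                      # sample old eigenvalues, M = 3
new = secular_roots(old, v=[0.3, 0.5, 0.4], h=1.0)
print(new)
# Each old eigenvalue is separated from the next by a new one.
print(all(new[k] <= old[k] <= new[k + 1] for k in range(len(old))))
```

With these sample numbers the four new roots separate the three old eigenvalues, so each old upper bound is either improved or left unchanged when the space is enlarged.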
Turning now to the case of degeneracy among the $\hat{E}_k(M)$, the qualitative picture isn't changed since one can consider degeneracy as a limiting case of non-degeneracy. Graphically what happens is that the appropriate $\Omega$ segments become steeper and steeper as successive $\hat{E}_k(M)$ come closer together, and in the limit become vertical lines. In particular then if (say) $\hat{E}_2(4) = \hat{E}_3(4)$ then $\hat{E}_3(5)$ will again equal this common value, though in general there would be no more degeneracy, or more generally, if there is an $n$-fold degeneracy among the $\hat{E}_k(M)$ at the value $E$, then the $\hat{E}_k(M+1)$ will have at least an $(n-1)$-fold degeneracy, also at the value $E$. However in any case (10) still holds.

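Finally let us consider the possibility that one, or more, of the $(\phi, H\hat{\psi}_k(M))$ vanish. Suppose in particular that in our example one of them, say $(\phi, H\hat{\psi}_k(M))$ for a particular $k$, becomes very small. This will mean that the $\Omega$ sections on either side of the vertical asymptote at $\hat{E} = \hat{E}_k(M)$ will hug the vertical asymptote more and more closely since the strength of the pole is being diminished. Thus we will have

[Figure: the graph of $\Omega$ when one of the poles is weak.]

In the limit then as $(\phi, H\hat{\psi}_k(M))$ becomes very small we will evidently find that one of the $\hat{E}(M+1)$ coincides with $\hat{E}_k(M)$, and in general the vanishing of $(\phi, H\hat{\psi}_k(M))$ means that $\hat{\psi}_k(M)$ remains an optimal trial function as one goes on to the next stage; however Eq. (10) still holds.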
Although it is primarily of theoretical interest, we will now compare the excited state bound one gets using only linear variation parameters, to what one would get if one used linear variation parameters and in addition could also impose orthogonality to lower states as discussed in [6], Sec. II. As might be expected, the latter procedure, if it can be carried out, will generally yield a better bound. Consider the first excited state. Then suppose that instead of simply using an $(M+1)$ dimensional basis set and trial functions of the form

$$\psi = \sum_{k=1}^{M+1} a_k\phi_k$$

we further require that $(\psi_1, \psi) = 0$, where $\psi_1$ is the lowest eigenfunction of $H$ (with appropriate symmetry). Thus we can use this last equation to determine one of the $a_k$ for which $(\psi_1, \phi_k) \neq 0$ in terms of the others. Let this one be $a_{M+1}$. Eliminating $a_{M+1}$ in this way then we see that this procedure is equivalent to using as trial functions the set

$$\psi = \sum_{k=1}^{M} a_k\phi'_k, \qquad \phi'_k \equiv \phi_k - \frac{(\psi_1, \phi_k)}{(\psi_1, \phi_{M+1})}\,\phi_{M+1}$$

Thus this procedure corresponds to using the linear variation method with the $M$ functions $\phi'_k$ as the basis set. We now note that if we adjoin the function $\phi_{M+1}$ to the functions $\phi'_k$ we will effectively recover our original $M+1$ dimensional basis set and therefore it follows from (10) with $k = 2$ that, in obvious notation,

$$\hat{E}'_1(M) \le \hat{E}_2(M+1)$$

On the other hand we also know from [6] of Sec. II that if $E_2$ is the 1st excited eigenvalue of $H$, then

$$E_2 \le \hat{E}'_1(M)$$

so we have

$$E_2 \le \hat{E}'_1(M) \le \hat{E}_2(M+1) \qquad \text{(VII-12)}$$

which shows as expected that $\hat{E}'_1(M)$, if we could calculate it, would be a better approximation to $E_2$ than is $\hat{E}_2(M+1)$. Another proof of this theorem and of its generalization to higher states is given in Appendix A.

The results which we have found in this section hold for any given choice of the $\phi_k$. In practice one often imbeds parameters ("non-linear parameters") in the $\phi_k$ and varies them as well. The reason for introducing non-linear parameters is that they are usually very effective in that one non-linear parameter can often do the work of many linear parameters. Thus a single parameter can produce an optimal exponent in $e^{-\alpha r}$ whereas it will in general take several terms to do as well by linearly superposing for example $e^{-r}$, $r e^{-r}$, etc. However the difficulties of dealing with non-linear parameters (see for example the paper by Handler and Joy cited in reference 7) coupled with the increasing power of modern computing machinery often swing the balance in favor of more linear parameters, i.e. more basis functions. When non-linear parameters are used one usually chooses the parameters in each $\hat{\psi}_k$ so as to minimize each $\hat{E}_k$ separately. This in general will mean different parameter values in each $\hat{\psi}_k$; however from what we have just said, the bounds are still valid though one has to be aware of the possibility of "curve crossing" as shown in the graphs below.

[Figure: graphs of the $\hat{E}_k$ against a non-linear parameter; one panel is labelled "No crossing", the other illustrates curve crossing.]

Thus in the curve crossing case illustrated, if one used the two minimum values, one actually gets two guaranteed upper bounds to $E_1$, rather than guaranteed upper bounds to $E_1$ and $E_2$. Such possibilities aside, one price one pays for having different parameters in different $\hat{\psi}_k$ is of course that (VI-9) and (VI-10) no longer hold for $k \neq l$, and so, to this extent, the $\hat{\psi}_k$ are less like eigenfunctions of $H$ than before.

VIII. SELF CONSISTENT FIELD METHODS - INTRODUCTION

In dealing with systems in which many particles move about at not too high densities and interact by means of long range forces, a natural, and one would expect quite accurate, approximation is to replace the detailed interactions acting on any one particle by a smooth field in which the particle is then assumed to move independently of the others, the smooth field representing the averaged effect of all the other particles calculated in some self consistent way (particle motions $\to$ fields $\to$ particle motions). Such self consistent field (SCF) models, and various approximations thereto, have been widely used to approximate the behavior of nuclei, atoms, molecules, solids, liquids, plasmas, galaxies, etc., so much so that they together with the linear variation method comprise the bulk of the approximation methods used for atoms and molecules.

In quantum mechanics the self consistent field idea is made precise by using the variation method to determine the smooth fields. Most simply let $H$ be a fixed nucleus Hamiltonian for the atom, molecule, or solid under consideration so that we need consider only the electrons. Then in accord with the above ideas we associate a single spin orbital with each electron and use trial functions of the form

$$\psi = |\phi_1 \phi_2 \cdots \phi_N| \qquad \text{(VIII-1)}$$

the use of a Slater determinant rather than a simple independent particle product being required by the Pauli principle. The optimal spin orbitals $\hat{\phi}_i$ are then determined by the variation method. This procedure is known as the unrestricted Hartree-Fock procedure (UHF), and we will discuss it in detail in the next two sections.

From such simple ideas however SCF methods have been developed in many forms and varieties. Without attempting a complete review let us note some of the reasons for this. For a more detailed summary and references to the literature we refer the reader to recent papers by Kaldor and Harris, by Kaldor, and by Larsson.

(i) As we will discuss in Section X, for open shell states which on simple one particle models would be described by single determinants made up of spin orbitals of appropriate symmetry, UHF often fails to yield $\hat{\psi}$ of proper symmetry (total spin, angular momentum, etc.). To ensure proper symmetry one may then further restrict the $\phi_i$ in some way; for example in atoms one can require that the self consistent fields be effectively a central field which is the same for all the orbitals. Also, if the symmetry requires it, one may have to superpose several determinants (vector coupling) formed from such restricted spin orbitals. To put the matter more physically - already the Pauli principle, which requires the use of determinantal wave functions rather than simple products, is to some extent in conflict with the original independent particle picture. It is therefore not surprising that requiring further "cooperation" among the particles in order to ensure proper overall symmetry requires further concessions. We should however point out that in making these remarks we have in fact inverted history. The restricted schemes were developed first. Later (for a brief review see Larsson's paper) there were reasons, both formal and physical, for relaxing the restrictions on the $\phi_i$, total relaxation yielding UHF, but various intermediate stages have also been discussed and used.
(ii) The physical reasons for relaxing the restrictions, and therefore the symmetry requirements, often had to do with the fact that, say for open shell atoms in a simple central field model, the closed shells are quite inert, particularly as regards magnetic properties, thereby yielding poor agreement with experiment. However since the symmetry properties (spin, angular momentum, etc.) are equally well experimental facts, it is natural that in addition to the restricted schemes, other SCF type methods have been developed which do meet the symmetry requirements, however not by further restrictions on the $\phi_i$ but rather by making $\psi$ more flexible. In addition many of these schemes (for a review of the spin symmetry problem see the papers by Kaldor and Harris, and by Kaldor) stick fairly close to the original physical picture in that they still are based on spin orbitals, and most importantly for the physical picture, the number of orbitals (though not necessarily the number of spin orbitals) involved is no greater than $N$ so that one can still make some correspondence between electrons and "states of motion".

For example in some schemes of this type ("extended Hartree-Fock"), the trial functions are of the form (1) with possibly some restrictions on the $\phi_i$, but multiplied by appropriate projection operators to enforce the desired symmetry.^2 In such schemes then there are still only $N$ spin orbitals. An example in which there are more than $N$ spin orbitals but still only $N$ orbitals is to be found in the use of "open shell" wave functions of the form

$$|R_1\alpha\ R_2\beta| - |R_1\beta\ R_2\alpha|$$

where $R_1$ and $R_2$ are radial functions and $\alpha$ and $\beta$ are spin functions, to describe the ground state of helium. These are $^1S$ functions but since they involve two radial functions they are more flexible than the restricted form

$$|R\alpha\ R\beta|$$

(which in this case is equivalent to UHF - see Sec. X) and can therefore yield a lower $\hat{E}$ (one does find that $\hat{R}_1 \neq \hat{R}_2$). However one can still speak of one electron being in the orbital $\hat{R}_1$ and the other in $\hat{R}_2$.

(iii) The departures from the simple form (1) described in (i) and (ii) all stayed rather close to the original physical picture. The third large class of departures (multi configuration SCF theories, or MC SCF),^3 though there is really no sharp distinction between these and those of (ii) except possibly in spirit, tend to start with the more formal view that in UHF one seeks the best single determinant approximation, and generalize this by seeking the best sum of two, three, ... determinants (perhaps subject to a priori restrictions of one kind or another on the spin orbitals). Indeed MC SCF theories can probably best be viewed as an economical CI in that one attempts to fully optimize a few configurations, hoping thereby to do the work of many more fixed configurations chosen more or less arbitrarily.^3

In the next two sections we will discuss UHF in some detail. It is formally the simplest of the SCF schemes, but it serves to illustrate most of the general features of these methods; and further it has been widely used. In Sec. X we will briefly develop another more restricted, but still rather general, SCF scheme, which is also of practical importance, and which will illustrate some further formal points.

One final note: In general the sum of two determinants is not a determinant. Therefore in UHF and the like we are not dealing with a linear space. Nevertheless there is still sufficient linearity in that

$$|\phi_1 \cdots \phi_i \cdots \phi_N| + |\phi_1 \cdots \phi'_i \cdots \phi_N|$$

does equal

$$|\phi_1 \cdots (\phi_i + \phi'_i) \cdots \phi_N|$$

so that in some circumstances one can assert that certain higher $\hat{E}$ provided by the method furnish upper bounds to certain excited states of $H$. For details we refer to the original paper of Perkins.

IX. THE UNRESTRICTED HARTREE-FOCK APPROXIMATION

As we discussed in the previous section UHF is formally the simplest of the SCF schemes in that in this approximation one uses only a single determinant with no further restrictions. Thus one calculates $E$ for $\psi$ a single determinant

$$\psi = |\phi_1 \phi_2 \cdots \phi_N| \qquad \text{(IX-1)}$$

with $H$ typically of the form

$$H = \sum_{s=1}^{N} f(s) + \frac{1}{2}\sum_{s=1}^{N}\sum_{t \neq s} g(st) \qquad \text{(IX-2)}$$

where $g(st) = g(ts)$, and then determines the $\phi_i$ in such a way that $\delta E = 0$. (We are here taking the straightforward approach to the variational method as discussed at the beginning of Sec. IV.)

Now in calculating $E$ one gets simple formulae ("Slater's rules")^1 if the $\phi_i$ are orthonormal. We will now show that there is no loss in generality in assuming this. First, however, we will prove a more general theorem: Consider any linear transformation of the spin orbitals

$$\phi'_i = \sum_{j=1}^{N} A_{ij}\phi_j \qquad \text{(IX-3)}$$

where the $A_{ij}$ are any set of numbers. It then follows that

$$|\phi'_1 \phi'_2 \cdots \phi'_N| = |A|\,|\phi_1 \phi_2 \cdots \phi_N| \qquad \text{(IX-4)}$$

where $|A|$ is the determinant of the matrix $A_{ij}$. Proof: Let us write out the left hand side of (4) in more detail, using (3) and explicitly introducing the particle labels. Thus, apart from the normalization constant, the left hand side is the determinant whose element in row $i$ and column $s$ is

$$\phi'_i(s) = \sum_{j=1}^{N} A_{ij}\phi_j(s)$$

which will be recognized as the determinant of the product of the two "matrices" $A_{ij}$ and $\phi_j(s)$. Equation (4) then follows from the standard theorem that the determinant of a product of matrices is equal to the product of the determinants of the separate matrices.
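The theorem is easy to check numerically for a small case. In the sketch below the numbers are arbitrary and purely illustrative: `phi[j][s]` stands in for the value of spin orbital $j$ at the coordinates of particle $s$, `A` is an arbitrary transformation matrix, and the helper names `det` and `matmul` are hypothetical. The common normalization constant is omitted since it multiplies both sides equally.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (fine for small N)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, x in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * x * det(minor)
    return total


def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


# Arbitrary sample values: phi[j][s] plays the role of orbital j at particle s.
phi = [[0.9, 0.1, -0.3],
       [0.2, 0.7, 0.4],
       [-0.5, 0.3, 0.8]]
# Arbitrary (non-unitary) transformation of the orbitals.
A = [[1.0, 0.5, 0.0],
     [0.0, 2.0, -1.0],
     [0.3, 0.0, 1.0]]

phi_new = matmul(A, phi)        # phi'_i(s) = sum_j A_ij phi_j(s)
lhs = det(phi_new)              # determinant built from the transformed orbitals
rhs = det(A) * det(phi)         # |A| times the original determinant
print(lhs, rhs)
```

The two printed numbers agree to rounding error, which is the content of the determinant product theorem used in the proof.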

Turning now to the question at hand, we first note that there are many linear transformations of the type (3), for example the well known Schmidt procedure, which, starting from a given linearly independent set of $\phi_i$ (and the $\phi_i$ must be linearly independent to start with or else $\psi = 0$) will produce an orthonormal set of $\phi'_i$. From (4) then we have that $\psi$ is proportional to the Slater determinant formed from the $\phi'_i$, and since the proportionality constant will simply cancel out in calculating $E$, we have the desired result. Notice however that the $\phi'_i$ are certainly not unique since given one orthonormal set any unitary transformation will produce another set, and from the theorem which we have just proven, this new set will yield essentially the same $\psi$.

Assuming then that the $\phi_i$ are orthonormal, we will now derive the equations which determine the $\hat{\phi}_i$, the optimal spin orbitals. Applying Slater's rules one finds

$$E = \sum_i (\phi_i, f\phi_i) + \frac{1}{2}\sum_i\sum_j \left[(\phi_i\phi_j, g\,\phi_i\phi_j) - (\phi_i\phi_j, g\,\phi_j\phi_i)\right] \qquad \text{(IX-5)}$$


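We now vary the spin orbitals and set $\delta E = 0$. Then after a bit of rearranging we find

$$0 = \sum_i \left[(\delta\phi_i, \mathcal{F}\phi_i) + (\mathcal{F}\phi_i, \delta\phi_i)\right] \qquad \text{(IX-6)}$$

where the operator $\mathcal{F}$ is the so called "Hartree-Fock single particle Hamiltonian" and is defined by its action on an arbitrary spin orbital $\theta$ according to

$$\mathcal{F}\theta(1) = f(1)\theta(1) + \sum_j \left[(\phi_j(2), g(12)\phi_j(2))\,\theta(1) - (\phi_j(2), g(12)\theta(2))\,\phi_j(1)\right] \qquad \text{(IX-7)}$$

It is then easy to show that $(\theta, \mathcal{F}\theta') = (\mathcal{F}\theta, \theta')$, i.e. that $\mathcal{F}$ is Hermitian.

In deriving (5) and (6) we have assumed that the $\phi_i$ are orthonormal. Thus we need not require that (6) be true for all $\delta\phi_i$ (in fact it would be in general impossible to find non-zero $\hat{\phi}_i$ which satisfy this), but rather we need only require it to be true for all $\delta\phi_i$ which also satisfy the further conditions $\delta(\phi_i, \phi_j) = 0$, i.e.

$$(\delta\phi_i, \phi_j) + (\phi_i, \delta\phi_j) = 0 \qquad \text{(IX-8)}$$

To take account of these constraints we introduce Lagrange multipliers $\epsilon_{ji}$ (see Appendix B) and replace (6) and (8) by

$$0 = \sum_i \left[(\delta\phi_i, \mathcal{F}\phi_i) + (\mathcal{F}\phi_i, \delta\phi_i)\right] - \sum_i\sum_j \epsilon_{ji}\left[(\delta\phi_i, \phi_j) + (\phi_i, \delta\phi_j)\right]$$

or, rearranging a bit,

$$0 = \sum_i \Big(\delta\phi_i,\ \mathcal{F}\phi_i - \sum_j \epsilon_{ji}\phi_j\Big) + \sum_i \Big(\mathcal{F}\phi_i - \sum_j \epsilon^*_{ij}\phi_j,\ \delta\phi_i\Big) \qquad \text{(IX-9)}$$

which is (see Appendix B) to be true for all $\delta\phi_i$.

We now note that the right hand side of (9) may be complex and hence its real and imaginary parts must vanish separately. In particular then, since $(\delta\phi_i, \mathcal{F}\phi_i) + (\mathcal{F}\phi_i, \delta\phi_i)$ is obviously real, the vanishing of the imaginary part means that

$$0 = \sum_i\sum_j \left[\epsilon_{ji}(\delta\phi_i, \phi_j) + \epsilon_{ij}(\phi_j, \delta\phi_i) - \epsilon^*_{ji}(\phi_j, \delta\phi_i) - \epsilon^*_{ij}(\delta\phi_i, \phi_j)\right]$$

which, after some rearranging, becomes

$$0 = \sum_i\sum_j \left[(\epsilon_{ji} - \epsilon^*_{ij})(\delta\phi_i, \phi_j) - (\epsilon_{ji} - \epsilon^*_{ij})^*(\phi_j, \delta\phi_i)\right] \qquad \text{(IX-10)}$$

Since this is to be satisfied for all $\delta\phi_i$, it must in particular be satisfied by

$$\delta\phi_i = \eta\sum_j (\epsilon_{ji} - \epsilon^*_{ij})\phi_j$$

where $\eta$ is a small pure imaginary number. Inserting this in (10) then yields

$$0 = -2\eta\sum_i \Big(\sum_j (\epsilon_{ji} - \epsilon^*_{ij})\phi_j,\ \sum_l (\epsilon_{li} - \epsilon^*_{il})\phi_l\Big)$$

which implies that

$$\sum_j (\epsilon_{ji} - \epsilon^*_{ij})\phi_j = 0$$

or, since the $\phi_j$ are supposed to be linearly independent,

$$\epsilon_{ji} - \epsilon^*_{ij} = 0 \qquad \text{(IX-11)}$$

i.e. the Lagrange multipliers must form a Hermitian matrix, and clearly this is not only a necessary condition to satisfy (10), it is also sufficient.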

Assuming (11) then, (9) becomes

$$0 = \sum_i \Big[\Big(\delta\phi_i,\ \mathcal{F}\phi_i - \sum_j \epsilon_{ji}\phi_j\Big) + \Big(\mathcal{F}\phi_i - \sum_j \epsilon_{ji}\phi_j,\ \delta\phi_i\Big)\Big]$$

from which, following what should by now be a familiar pattern, we are led to

$$\mathcal{F}\hat{\phi}_i = \sum_j \epsilon_{ji}\hat{\phi}_j \qquad \text{(IX-12)}$$

where the $\epsilon_{ji}$ are to be determined in such a way as to ensure the orthonormality of the $\hat{\phi}_i$.

We now note that, conversely, if the $\hat{\phi}_i$ are orthonormal, then (11) will automatically be satisfied, since then from (12)

$$\epsilon_{ji} = (\hat{\phi}_j, \mathcal{F}\hat{\phi}_i) \qquad \text{(IX-13)}$$

which, since $\mathcal{F}$ is Hermitian, equals $(\mathcal{F}\hat{\phi}_j, \hat{\phi}_i)$, which in turn equals $\epsilon^*_{ij}$. Further we note that one choice of the Lagrange multipliers which will guarantee orthogonality is to put $\epsilon_{ij} = 0$ for $i \neq j$, since then the $\hat{\phi}_i$ will all be eigenfunctions of the same Hermitian operator, namely $\mathcal{F}$. The diagonal elements of $\epsilon_{ij}$ then remain to be used to enforce normalization. Using just a single subscript for the diagonal elements then, the equations

$$\mathcal{F}\hat{\phi}_i = \epsilon_i\hat{\phi}_i \qquad \text{(IX-14)}$$

define the "canonical" UHF spin orbitals; and clearly from the form of $\mathcal{F}$ as given in Eq. (7), these equations do conform to the physical picture of Sec. VIII though admittedly the exchange terms, the terms with the minus sign in $\mathcal{F}$, do not admit of a simple physical interpretation.

It should be pointed out however that the canonical spin orbitals are not necessarily the most useful ones physically. Other sets of spin orbitals derived from these by a unitary transformation may have more desirable properties, for example they may be better localized. Also it has been suggested that certain non-orthogonal sets may also be useful. We will not pursue these matters further here, except to note that considerations of this sort are not limited to UHF; they can be applied to the spin orbitals of any Slater determinant.

Equations (14), when they can be solved at all (see below), are usually solved by an iterative process. One first "guesses" a set of orthonormal spin orbitals, call them $\phi_i^{(0)}$. From them one constructs, in an obvious way, a first approximation to $\mathcal{F}$, call it $\mathcal{F}^{(0)}$. One then proceeds to solve

$$\mathcal{F}^{(0)}\phi_i^{(1)} = \epsilon_i^{(1)}\phi_i^{(1)}$$

which is an ordinary eigenvalue problem for $\phi_i^{(1)}$. Thus the solutions are automatically orthogonal if there is no degeneracy, and if there is degeneracy they can be chosen orthogonal. Also they can be normalized. One then proceeds to calculate $\mathcal{F}^{(1)}$ etc., etc., stopping when (hopefully) a sufficient degree of self consistency has been reached, that is when the $\phi_i^{(n+1)}$ are sufficiently similar to the $\phi_i^{(n)}$.
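The iterative scheme just described can be sketched on a deliberately tiny toy problem. Everything specific below is an assumption made for illustration and is not taken from the text: a two-site model with site energies `eps1`, `eps2`, hopping `t`, and an on-site repulsion `U` felt by an electron through the density of the opposite-spin electron, with both electrons in the same spatial orbital. The loop is the generic one: guess, build the single particle operator, solve the eigenvalue problem, rebuild, and stop when successive orbitals agree.

```python
import math


def lowest_eigenpair(a, b, d):
    """Lowest eigenvalue and normalized eigenvector of [[a, b], [b, d]] (b != 0)."""
    lam = 0.5 * (a + d) - math.hypot(0.5 * (a - d), b)
    x, y = b, lam - a              # solves (a - lam) x + b y = 0
    norm = math.hypot(x, y)
    return lam, (x / norm, y / norm)


def scf(eps1=0.5, eps2=-0.5, t=1.0, U=1.0, tol=1e-12, max_iter=200):
    """Iterate F(n) c = lam c to self consistency for the toy two-site model."""
    n1, n2 = 0.5, 0.5              # initial guess for the site densities
    for iteration in range(1, max_iter + 1):
        # Build the single particle operator from the current densities.
        lam, (c1, c2) = lowest_eigenpair(eps1 + U * n1, -t, eps2 + U * n2)
        new1, new2 = c1 * c1, c2 * c2
        change = abs(new1 - n1) + abs(new2 - n2)
        n1, n2 = new1, new2
        if change < tol:           # successive orbitals sufficiently similar
            return lam, (c1, c2), iteration
    raise RuntimeError("SCF did not converge")


orbital_energy, orbital, n_iter = scf()
print(orbital_energy, orbital, n_iter)
```

For these sample parameters the plain iteration converges without damping; in less favorable cases the same loop may oscillate or diverge, which is why the text says "hopefully".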
Since $\mathcal{F}$ involves all the $\hat{\phi}_i$, (14) is really a set of coupled non-linear integro-partial-differential equations. For real atoms and molecules it does not seem possible to exhibit closed form solutions, so other approaches must be used. If one can reduce the equations to coupled equations in one variable then a direct iterative numerical attack is possible. In particular if $H$ is the non-relativistic Hamiltonian for an isolated atom in the fixed nucleus approximation then, as we shall discuss in the next section, one can often find solutions of (14) in which the $\hat{\phi}_i$ have the spin and angular dependence that one expects on the basis of the central field model of the atom, with only the radial dependence remaining to be determined from (14). However even for the simplest of molecules, that is diatomic molecules, the most that one can hope to get "for free" is the dependence of the $\hat{\phi}_i$ on the azimuthal angle around the internuclear axis, and therefore one is still left with two independent variables to deal with.

To get a finite problem the standard procedure is to further restrict the $\phi_i$ by requiring that they be expandable in finite basis sets (which may contain non-linear parameters, however we will not consider them explicitly). The optimal values of the expansion coefficients are then determined from (9). We will refer to this procedure as analytic unrestricted Hartree-Fock (AUHF). Similar analytic approximations can be, and regularly are, made to other SCF and MC SCF type approximations. Thus we write

$$\phi_i = \sum_{a=1}^{M} C_{ia}\chi_a \qquad \text{(IX-15)}$$

where to avoid notational complexity we have required that each $\phi_i$ be expandable in the same finite basis set. (Note however that this still permits different $\phi_i$ to have different symmetry; one includes among the $\chi_a$ functions of various symmetries so that if the appropriate $C_{ia}$ turn out to vanish, $\phi_i$ will have a definite symmetry.) Then (9) becomes

$$0 = \sum_i\sum_a \delta C^*_{ia}\Big[\sum_b (\chi_a, \mathcal{F}\chi_b)C_{ib} - \sum_j \epsilon_{ji}\sum_b (\chi_a, \chi_b)C_{jb}\Big] + \sum_i\sum_a \Big[\sum_b (\chi_a, \mathcal{F}\chi_b)C_{ib} - \sum_j \epsilon^*_{ij}\sum_b (\chi_a, \chi_b)C_{jb}\Big]^*\delta C_{ia} \qquad \text{(IX-16)}$$

where now

$$\mathcal{F}\theta(1) = f(1)\theta(1) + \sum_j\sum_{a,b} C^*_{ja}C_{jb}\left[(\chi_a(2), g(12)\chi_b(2))\,\theta(1) - (\chi_a(2), g(12)\theta(2))\,\chi_b(1)\right] \qquad \text{(IX-17)}$$
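At this point it is helpful to introduce some matrix notation. Thus we introduce the $M \times M$ Hermitian matrix $F$ whose elements are $(\chi_a, \mathcal{F}\chi_b)$ and the $M \times M$ positive definite Hermitian "overlap matrix" $S$ whose elements are $(\chi_a, \chi_b)$, and finally we introduce $N$, $M$ element column vectors $C_i$ whose elements are $C_{ia}$. In terms of these quantities then (16) can be written

$$0 = \sum_i \Big[\Big(\delta C_i,\ F C_i - \sum_j \epsilon_{ji} S C_j\Big) + \Big(F C_i - \sum_j \epsilon^*_{ij} S C_j,\ \delta C_i\Big)\Big] \qquad \text{(IX-18)}$$

where $(X, Y)$ means the usual scalar product of two complex vectors

$$(X, Y) = \sum_{a=1}^{M} X^*_a Y_a \qquad \text{(IX-19)}$$

Also in this same notation the orthonormality requirement

$$(\phi_i, \phi_j) = \delta_{ij} \qquad \text{(IX-20)}$$

becomes

$$(C_i, S C_j) = \delta_{ij} \qquad \text{(IX-21)}$$

From this point on the argument proceeds much as before. Indeed the steps are identical if we introduce the vectors $D_i$ and the Hermitian matrix $F'$ defined by

$$D_i = S^{1/2} C_i, \qquad F' = S^{-1/2} F S^{-1/2} \qquad \text{(IX-22)}$$

so that (18) and (21) become

$$0 = \sum_i \Big[\Big(\delta D_i,\ F' D_i - \sum_j \epsilon_{ji} D_j\Big) + \Big(F' D_i - \sum_j \epsilon^*_{ij} D_j,\ \delta D_i\Big)\Big]$$

and

$$(D_i, D_j) = \delta_{ij}$$

which have precisely the same structure as (9) and (20). Thus we are again led to (11) and instead of (14) we have

$$F'\hat{D}_i = \epsilon_i\hat{D}_i$$

or from (22)

$$F\hat{C}_i = \epsilon_i S\hat{C}_i \qquad \text{(IX-23)}$$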
Incidentally, recalling the discussion in Sec. V, note that these same equations can also be derived by taking "moments" of (14). That is, if we substitute for the $\hat{\phi}_i$ from (15) into (14) (including $\mathcal{F}$) and take the scalar product of the resulting "equation" with $\chi_a$, one arrives at (23).

The equations (23), with (17), constitute a set of non-linear algebraic equations for the $\hat{C}_i$. Again the usual solution procedures are iterative. One chooses some $C_i^{(0)}$, computes $F^{(0)}$, solves

$$F^{(0)} C_i^{(1)} = \epsilon_i^{(1)} S C_i^{(1)} \qquad \text{(IX-24)}$$

as an ordinary algebraic eigenvalue problem, calculates $F^{(1)}$, etc. etc. For the details of practical procedures for doing this for atoms as well as a discussion of a treatment of non-linear parameters we refer the reader to the article by Roothaan and Bagus.^9 For molecules, especially for large molecules, this procedure may still not be practical at all, or it may be practical only using the largest of computers, because of the sheer number of integrals which must be calculated to evaluate $F$ and often, depending on the nature of the $\chi_a$, because of difficulty in calculating them. These difficulties, in addition to spawning large literatures on the choice of the $\chi_a$, and on integral evaluation, have also led to the development of many methods which further approximate $F$ in some way or other; however we will not attempt to review these methods here.

Thus far we have not looked at $\hat{E}$, and that it has not emerged automatically is of course a consequence of the fact that our $\psi$ all had a fixed scale, $(\psi, \psi) = 1$. Clearly it would be nice, and would yield a simple physical interpretation of the $\epsilon_i$, if $\hat{E}$ were to equal $\sum_i \epsilon_i$. However this is not the case since from (5) we have

$$\hat{E} = \sum_i (\hat{\phi}_i, f\hat{\phi}_i) + \frac{1}{2}\sum_i\sum_j \left[(\hat{\phi}_i\hat{\phi}_j, g\,\hat{\phi}_i\hat{\phi}_j) - (\hat{\phi}_i\hat{\phi}_j, g\,\hat{\phi}_j\hat{\phi}_i)\right] \qquad \text{(IX-25)}$$

or

$$\begin{aligned}
\hat{E} = {} & \sum_i (\hat{\phi}_i, f\hat{\phi}_i) + \sum_i\sum_j \left[(\hat{\phi}_i\hat{\phi}_j, g\,\hat{\phi}_i\hat{\phi}_j) - (\hat{\phi}_i\hat{\phi}_j, g\,\hat{\phi}_j\hat{\phi}_i)\right] \\
& - \frac{1}{2}\sum_i\sum_j \left[(\hat{\phi}_i\hat{\phi}_j, g\,\hat{\phi}_i\hat{\phi}_j) - (\hat{\phi}_i\hat{\phi}_j, g\,\hat{\phi}_j\hat{\phi}_i)\right]
\end{aligned} \qquad \text{(IX-26)}$$

In this form the first line will be recognized as $\sum_i (\hat{\phi}_i, \mathcal{F}\hat{\phi}_i)$, which therefore from (14) and the fact that the $\hat{\phi}_i$ are normalized, does equal $\sum_i \epsilon_i$. However there is still the second line which is clearly just the negative of the average of the two body interaction $\frac{1}{2}\sum_s\sum_{t \neq s} g(st)$. Therefore we have

$$\hat{E} = \sum_i \epsilon_i - \Big(\hat{\psi},\ \frac{1}{2}\sum_s\sum_{t \neq s} g(st)\,\hat{\psi}\Big) \qquad \text{(IX-27)}$$

so that, as we said, $\hat{E}$ is not just the sum of the $\epsilon_i$ (an analogous result holds in AUHF). The physical basis for this result is of course the fact that $\sum_i \epsilon_i$ counts each electron-electron interaction twice. Thus one cannot immediately ascribe physical significance to the $\epsilon_i$ as "one electron energies". However there is a good empirical correlation between the $\epsilon_i$ and ionization energies, a correlation that is supported by the following theoretical result known as Koopmans' theorem:^13 Suppose that as an approximation to the optimal UHF function for the $N-1$ electron ionized system one uses the determinant gotten from $\hat{\psi}$ by deleting the $k$'th spin orbital. Then the $E$ for the $N-1$ particle system will differ from $\hat{E}$ in (25) by the removal of all terms involving $\hat{\phi}_k$. However these terms are precisely

$$(\hat{\phi}_k, f\hat{\phi}_k) + \sum_j \left[(\hat{\phi}_k\hat{\phi}_j, g\,\hat{\phi}_k\hat{\phi}_j) - (\hat{\phi}_k\hat{\phi}_j, g\,\hat{\phi}_j\hat{\phi}_k)\right]$$

or

$$(\hat{\phi}_k, \mathcal{F}\hat{\phi}_k)$$

or finally $\epsilon_k$. Thus in this approximation $-\epsilon_k$ is precisely the ionization energy. Of course this kind of trial function for the $N-1$ particle system with so to speak "frozen orbitals" would not seem to be a very good approximation, especially if inner shell electrons are being ionized; presumably it would be better to do a complete UHF calculation for the $N-1$ particle system. Nevertheless the result does give some feeling for the empirical situation.^14
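Both statements, that the total energy is the sum of the orbital energies minus the average two body interaction, and that a frozen orbital removal changes the energy by exactly one orbital energy, can be verified on a small toy model. The model below is an assumption introduced purely for illustration (two sites, one spin-up and one spin-down electron in the same spatial orbital, an on-site repulsion `U` acting only between opposite spins); the constants and helper names are hypothetical.

```python
import math

T, U, EPS = 1.0, 1.0, (0.5, -0.5)   # hypothetical hopping, repulsion, site energies


def h_expect(c):
    """One-electron expectation value (c, h c), h = [[EPS[0], -T], [-T, EPS[1]]]."""
    return EPS[0] * c[0] ** 2 + EPS[1] * c[1] ** 2 - 2.0 * T * c[0] * c[1]


def lowest_orbital(n):
    """Lowest eigenpair of the single particle operator h + U*diag(n)."""
    a, d, b = EPS[0] + U * n[0], EPS[1] + U * n[1], -T
    lam = 0.5 * (a + d) - math.hypot(0.5 * (a - d), b)
    x, y = b, lam - a
    norm = math.hypot(x, y)
    return lam, (x / norm, y / norm)


# Iterate to self consistency (densities n_i = c_i^2 of the opposite-spin electron).
n = (0.5, 0.5)
for _ in range(200):
    eps, c = lowest_orbital(n)
    n = (c[0] ** 2, c[1] ** 2)
eps, _ = lowest_orbital(n)           # orbital energy at self consistency

v_ee = U * (n[0] ** 2 + n[1] ** 2)   # average two body interaction
e_total = 2.0 * h_expect(c) + v_ee   # energy of the two electron determinant
e_ion_frozen = h_expect(c)           # one electron left in the frozen orbital

print(e_total, 2.0 * eps - v_ee)     # total energy vs. sum of orbital energies minus v_ee
print(e_total - e_ion_frozen, eps)   # frozen-orbital energy difference vs. orbital energy
```

In each printed pair the two numbers agree to rounding error: the sum of the two orbital energies counts the repulsion twice, and deleting one electron with the orbital held fixed lowers the energy by exactly one orbital energy.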

X. RESTRICTED HARTREE-FOCK AND UHF

If for an isolated atom or molecule, simple one particle models predict that single determinant closed shell states are possible then one can show that the UHF equations do have solutions of the expected symmetry. Thus for two electron atoms (with $H$ the nonrelativistic, fixed nucleus, spin free Hamiltonian) one can find solutions of the form

$$|R\alpha\ R\beta| \qquad \text{(X-1)}$$

and similarly for ten electron atoms one can find solutions of the form

$$|1s^2\,2s^2\,2p^6|$$

However when one goes to open shells the situation changes in that,
considering atoms to be definite, even though a simple central field
model would allow the state to be described by a single determinant
(i.e. no vector coupling needed), still as we mentioned in Sec. VIII,
UHF will not in general have solutions of the expected symmetry.

Thus, confining ourselves for the moment to s-orbitals, for two electron 
systems we can find solutions of the form 

$$\hat\psi = |\,R_{1s}\alpha\;\;R_{2s}\alpha\,| \tag{X-2}$$

which is a member of a spin triplet. However for three electron 
systems it is easy to see by trying that though there are pure quartet 
solutions there are no pure doublets 

$$\hat\psi = |\,R_{1s}\alpha\;\;R_{1s}\beta\;\;R_{2s}\alpha\,|$$

but rather there are solutions of the form 

$$\hat\psi = |\,R_{1s}\alpha\;\;R_{1s}'\beta\;\;R_{2s}\alpha\,| \tag{X-4}$$

which is a linear combination of doublet and quartet. 
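
The doublet-quartet mixing can be made quantitative with the standard expression for $\langle S^2\rangle$ of a single determinant with different orbitals for different spins, $\langle S^2\rangle = S_z(S_z+1) + N_\beta - \sum_{ij}|(a_i, b_j)|^2$, where the $a_i$ are the spatial orbitals carrying $\alpha$ spin and the $b_j$ those carrying $\beta$ spin. The sketch below is an illustration only: the orbitals are hypothetical coefficient vectors in a three-function orthonormal basis, and the angle `theta` measuring how far the $\beta$ orbital has relaxed away from its $\alpha$ partner is an assumed parameter, not something determined by a UHF calculation.

```python
import math

def overlap(u, v):
    """Inner product of two real orbitals given as coefficient lists
    in a common orthonormal basis."""
    return sum(a * b for a, b in zip(u, v))

def s_squared(alpha_orbs, beta_orbs):
    """<S^2> of a single determinant with the given alpha and beta spatial orbitals.

    Assumes the alpha orbitals are orthonormal among themselves, likewise the
    beta orbitals, and that there are at least as many alpha as beta orbitals.
    """
    n_a, n_b = len(alpha_orbs), len(beta_orbs)
    sz = 0.5 * (n_a - n_b)
    mixing = sum(overlap(a, b) ** 2 for a in alpha_orbs for b in beta_orbs)
    return sz * (sz + 1.0) + n_b - mixing

def three_electron_example(theta):
    """Determinant u1(alpha) u1'(beta) u2(alpha), with u1' rotated away from u1 by theta."""
    u1 = [1.0, 0.0, 0.0]
    u2 = [0.0, 1.0, 0.0]
    u1_prime = [math.cos(theta), 0.0, math.sin(theta)]
    return s_squared([u1, u2], [u1_prime])

print(three_electron_example(0.0))  # identical 1s orbitals: pure doublet value
print(three_electron_example(0.3))  # relaxed beta orbital: doublet plus some quartet
```

For `theta = 0` the value is the pure doublet 3/4; for any other angle it lies between 3/4 and the quartet value 15/4, which is the numerical signature of a function that is a linear combination of doublet and quartet.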

A similar situation exists with respect to orbital angular momentum
when one investigates what would normally be single determinant open
shell states involving orbitals of non zero angular momentum, for
example $^2P$ for three electrons. One finds solutions which are
eigenfunctions of $L_z$ but not of $L^2$. Further, although the
orbitals are eigenfunctions of $l_z$, the z component of one particle
angular momentum, they are not eigenfunctions of $l^2$. In Sec. VIII
we briefly sketched and gave references to the responses which have
been made to this situation. Here we want to pursue the restricted
Hartree-Fock approach in more detail.

Applied to closed shells, and to be specific let us consider the
two electron atomic example, it would consist in restricting the
$\tilde\psi$ to be of the form

$$\tilde\psi = |\,R_{1s}\alpha\;\;R_{1s}\beta\,|$$

from the outset, with $R_{1s}$ a radial function, and then determining
$R_{1s}$ from the variation method. Since as we have said, UHF does
have solutions of this type it follows that for closed shells the
restricted Hartree-Fock functions satisfy the UHF equation. We mention
this because this then implies that the RHF functions for closed shells
will also satisfy the many interesting special theorems which are
satisfied by UHF functions, theorems which we will be discussing in
Sec. XII.

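Turning now to open shells, consider again the three electron
doublet example. There one would use trial functions of the form

$$\tilde\psi = |\,R_{1s}\alpha\;\;R_{1s}\beta\;\;R_{2s}\alpha\,|$$

More generally let us consider trial functions of the form

$$\tilde\psi = |\,\tilde u_1\alpha\;\;\tilde u_1\beta\;\;\tilde u_2\alpha\,| \tag{X-5}$$

where we do not restrict the form of the orbitals $\tilde u_i$, i.e. they
need not be simply radial functions. Since adding a multiple of
$\tilde u_1$ to $\tilde u_2$ will not change the value of the determinant
we may assume that $\tilde u_1$ and $\tilde u_2$ are orthogonal and of
course we may assume that each is normalized. Then if h and g are spin
independent one readily finds from (IX-5) that

$$\tilde E = 2(\tilde u_1, h\tilde u_1) + (\tilde u_2, h\tilde u_2) + (\tilde u_1\tilde u_1, g\,\tilde u_1\tilde u_1) + 2(\tilde u_1\tilde u_2, g\,\tilde u_1\tilde u_2) - (\tilde u_1\tilde u_2, g\,\tilde u_2\tilde u_1)$$

$\delta\tilde E = 0$ subject to the conditions $(\tilde u_i, \tilde u_j) = \delta_{ij}$ then leads to

$$[2h + 2J_1 + 2J_2]\,\hat u_1 - K_{21}\hat u_2 = \lambda_{11}\hat u_1 + \lambda_{12}\hat u_2 \tag{X-6}$$

and

$$[h + 2J_1]\,\hat u_2 - K_{12}\hat u_1 = \lambda_{21}\hat u_1 + \lambda_{22}\hat u_2 \tag{X-7}$$

where the $\lambda_{ij}$ are Lagrange multipliers and

$$J_i(1) = \int \hat u_i^*(2)\,g(1,2)\,\hat u_i(2)\,dv_2, \qquad K_{ij}(1) = \int \hat u_i^*(2)\,g(1,2)\,\hat u_j(2)\,dv_2$$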


There are now two main points which we want to make:

(i) Without trying to write these equations in very elegant
form it should be clear that we cannot ensure the orthogonality of
the two orbitals $\hat u_1$ and $\hat u_2$ by simply putting the off
diagonal Lagrange multipliers equal to zero since $\hat u_1$ and
$\hat u_2$ will not then be eigenfunctions of a common Hamiltonian. Rather
one must deal with the off diagonal Lagrange multipliers explicitly
and this is a general feature of restricted open shell SCF calculations.
Various procedures have been devised to do this and we will simply
refer the interested reader to the original literature on the subject.^3
Also direct search procedures, such as we mentioned at the outset of
Sec. IV, which effectively bypass the equations, have been used.^4
(Such procedures are anyway almost invariably used to find the optimal
values of non linear parameters).

(ii) It is easy to see that the equations do admit solutions of
the form (3). Thus the restricted Hartree-Fock function for this case
will share the general properties of solutions of (7). More generally
we will denote (for no particular reason) by OHF the procedure in
which one deals with trial functions of the form

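$$\tilde\psi = |\,\tilde u_1\alpha\;\;\tilde u_1\beta\;\cdots\;\tilde u_p\alpha\;\;\tilde u_p\beta\;\;\tilde u_{p+1}\alpha\;\cdots\;\tilde u_{N-p}\alpha\,|$$

which are eigenfunctions of $S_z$ with eigenvalue $\tfrac12(N-2p)$ and of
$S^2$ with eigenvalue $\tfrac12(N-2p)\left[\tfrac12(N-2p)+1\right]$, the
discussion above then being for the special case N = 3 (with p = 1).
One can then show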

that restricted Hartree-Fock functions for atoms and molecules
describing closed shells plus a spatially closed shell with all spins
aligned will satisfy the OHF equations, and therefore will also
satisfy the theorems for OHF which will be discussed in Sec. XII.
However since no essentially new points of principle arise we will
not write out the detailed equations for general OHF and AOHF.


XI. THE GENERALIZED BRILLOUIN THEOREM

In general whatever sort of trial functions one uses, any $\hat\psi$
will almost certainly be only an approximation to an eigenfunction
of H, and so the question naturally arises, how can we improve
on the approximation? One approach of course is simply to enlarge
the set of trial functions in some way. Another would be to use
Rayleigh-Schroedinger perturbation theory, and it is this approach
which we want to discuss in this section.

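In order to use RS perturbation theory we must introduce a zero
order Hamiltonian $H_0$ which has $\hat\psi$ as an eigenfunction:

$$H_0\hat\psi = \mathcal{E}\hat\psi \tag{XI-1}$$

where $\mathcal{E}$, the eigenvalue, may or may not equal $\hat E$. Having
chosen $H_0$ one now treats $H - H_0$ as a perturbation. The first
order correction to $\hat\psi$ is therefore (we will assume that
$\mathcal{E}$ is non-degenerate and as usual will use a discrete notation;
also we will assume that $\hat\psi$ is normalized)

$$\psi^{(1)} = \sum_k \Theta_k\,\frac{(\Theta_k, (H - H_0)\hat\psi)}{\mathcal{E} - \mathcal{E}_k} \tag{XI-2}$$

where the $\Theta_k$ are the other orthonormal eigenfunctions of $H_0$
and where the $\mathcal{E}_k$ are the corresponding eigenvalues. Finally
using (1) and the fact that the $\Theta_k$ are orthogonal to $\hat\psi$
we can write (2) as

$$\psi^{(1)} = \sum_k \Theta_k\,\frac{(\Theta_k, H\hat\psi)}{\mathcal{E} - \mathcal{E}_k} \tag{XI-3}$$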

Note however that this perturbation scheme involves a great deal of
arbitrariness since if we write

$$H_0 = \mathcal{E}\,|\hat\psi)(\hat\psi| + \sum_k \mathcal{E}_k\,|\Theta_k)(\Theta_k| \tag{XI-4}$$

then (1) will be satisfied whatever we choose for the $\mathcal{E}_k$ and
the $\Theta_k$ so long as the latter are orthogonal to $\hat\psi$. However


different choices of the $\mathcal{E}_k$ and the $\Theta_k$ can make profound
changes in $\psi^{(1)}$.

We now note that as a consequence of the variation method,
certain $(\Theta_k, H\hat\psi)$ may vanish so that the corresponding
$\Theta_k$ will not occur in $\psi^{(1)}$ though of course they may appear
in $\psi^{(2)}$, etc. Namely suppose that, with $\delta\eta$ an infinitesimal
but otherwise arbitrary complex number,

$$\delta\hat\psi = \delta\eta\,\Theta_k \tag{XI-5}$$

is among the variations of $\hat\psi$ possible within the set of trial
functions. Then it follows from (V-3) and the orthogonality of
$\Theta_k$ and $\hat\psi$ that

$$(\Theta_k, H\hat\psi) = 0 \tag{XI-6}$$

which from (3) tells us that $\Theta_k$ will not appear in $\psi^{(1)}$. In
the context of UHF where as we will see the $\Theta_k$ satisfying (6)
are one-electron excitations of $\hat\psi$, this is known as Brillouin's
theorem. (In our earlier discussion of UHF we did not specifically
invoke Eq. (V-3). However it clearly must be satisfied since in UHF
one imposes no a priori reality restrictions). Therefore we will
call it the generalized Brillouin theorem. More precisely, and quite
apart from its application to perturbation theory, the generalized
Brillouin theorem is the following: Let $\delta\hat\psi = \delta\eta\,\chi$,
with $\delta\eta$ an infinitesimal but otherwise arbitrary complex number,
and with $\chi$ orthogonal to $\hat\psi$, be a possible variation of
$\hat\psi$ within the set of trial functions, then it follows from (V-3)
that

$$(\chi, H\hat\psi) = 0 \tag{XI-7}$$

In many ways then the generalized Brillouin theorem is really not a
new result but merely a restatement of the variation method and is so
used by many authors.^2

Returning to perturbation theory, let us in particular consider
UHF. Then a natural choice for $H_0$ is

$$H_0 = \sum_{s=1}^{N} f(s) \tag{XI-8}$$

Since $f$ is a one electron operator this means that the $\Theta_k$ are
single determinants involving 1, 2, ..., N electron excitations of
$\hat\psi$. (For ground states of neutral systems these $\Theta_k$ are
usually in the continuum^4). We will now show that the one electron
excitations do not appear in $\psi^{(1)}$ (and therefore do not affect the
energy till fourth order). Proof: For $\hat\psi$ we have

$$\hat\psi = \frac{1}{\sqrt{N!}}\,|\,\hat\varphi_1 \cdots \hat\varphi_N\,| \tag{XI-9}$$

where the $\hat\varphi_i$ are orthonormal spin orbitals, and where we have
included the factor $1/\sqrt{N!}$ so that $\hat\psi$ will be normalized.
Therefore the most general $\delta\hat\psi$ is of the form

$$\delta\hat\psi = \frac{1}{\sqrt{N!}}\sum_{i=1}^{N} |\,\hat\varphi_1 \cdots \delta\hat\varphi_i \cdots \hat\varphi_N\,| \tag{XI-10}$$

where the $\delta\hat\varphi_i$ are arbitrary. If in particular we choose the
$\delta\hat\varphi_i$ to be orthogonal to the $\hat\varphi_j$ but otherwise
arbitrary, then (10) is an arbitrary sum of one-electron excitations of
$\hat\psi$ which proves the point.

Similarly for OHF the $\chi$ which satisfy the generalized
Brillouin theorem are a general superposition of two types: one
electron excitation of the singly occupied orbitals without change in
spin, and paired excitations of the doubly occupied orbitals, again
without change in spin. Thus for the example (X-5) these types are

$$|\,\hat u_1\alpha\;\;\hat u_1\beta\;\;v\alpha\,|$$

and

$$|\,w\alpha\;\;\hat u_1\beta\;\;\hat u_2\alpha\,| + |\,\hat u_1\alpha\;\;w\beta\;\;\hat u_2\alpha\,|$$

where for orthogonality to $\hat\psi$ we need

$$(\hat u_2, v) = (\hat u_1, w) = 0$$

The second type of excitation may then be further classified according
to whether or not $w$ is orthogonal to $\hat u_2$. We will discuss
the linear variation method later.

Returning now to UHF, the fact that with the choice (8) for $H_0$,
$\psi^{(1)}$ contains no one electron excitations of $\hat\psi$ has an
interesting consequence: there are no first order corrections to the
average value of any one electron operator or equivalently there are no
first order corrections to the one-electron density matrix (indeed the
one-electron density matrix is itself the expectation value of a one
electron operator), and hence to its eigenvalues, and to its
eigenfunctions the natural spin orbitals. Proof: Let $W$ be a
one-electron operator. Then the first order correction to its
expectation value is


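$$(\psi^{(1)}, W\hat\psi) + (\hat\psi, W\psi^{(1)}) \tag{XI-11}$$

which, since from (3) $\psi^{(1)}$ is orthogonal to $\hat\psi$, can be written

$$\big(\psi^{(1)}, [W - (\hat\psi, W\hat\psi)]\hat\psi\big) + \big([W^\dagger - (\hat\psi, W^\dagger\hat\psi)]\hat\psi, \psi^{(1)}\big) \tag{XI-12}$$

However if $W = \sum_s w(s)$ is a one-electron operator then, either from
Slater's rules or from

$$W\hat\psi = \frac{1}{\sqrt{N!}}\sum_{i=1}^{N} |\,\hat\varphi_1 \cdots w\hat\varphi_i \cdots \hat\varphi_N\,|$$

one readily sees that $[W - (\hat\psi, W\hat\psi)]\hat\psi$ and
$[W^\dagger - (\hat\psi, W^\dagger\hat\psi)]\hat\psi$ involve only
one-electron excitations of $\hat\psi$ so that (12) vanishes as claimed.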

Because of the freedom in the choice of $H_0$ this result,
though interesting, is rather more formal than physical. Thus for
UHF we could certainly choose the $\Theta_k$ and hence $H_0$ so that
none of the $\Theta_k$ was a pure one-electron excitation. In such a
case then Brillouin's theorem would have no especially interesting
consequences for UHF. For UHF applied to atoms, a more physical result
can be derived as follows:

We want to consider an isoelectronic sequence in the limit that
nuclear charge $Z \to \infty$. Therefore to keep things under control
we will, as usual, use scaled coordinates $Z\mathbf{r}$ and measure all
energies and Hamiltonians in "units" of $Z^2$. For H we take the
usual fixed nucleus Hamiltonian (divided by $Z^2$) possibly including
external fields, and for $H_0$ we will again use (8). Then
since $H$ and $H_0$ differ only in their treatment of the Coulomb
interaction between the electrons, and since in scaled coordinates
the Coulomb interaction is of order $1/Z$, it follows that

$$(H - H_0) = O(1/Z) \tag{XI-13}$$

and that therefore, if we denote by $\psi$ the normalized eigenfunction
of $H$ to which $\hat\psi$ is an approximation and write

$$\psi = \hat\psi + \Delta$$

then

$$\Delta = O(1/Z) \tag{XI-14}$$

Further since $(\psi, \psi) = (\hat\psi, \hat\psi)$ we have

$$(\hat\psi, \Delta) + (\Delta, \hat\psi) + (\Delta, \Delta) = 0 \tag{XI-15}$$

i.e. $(\hat\psi, \Delta) + (\Delta, \hat\psi)$ is at least of order $(1/Z)^2$.

Now consider any operator $O$ such that in scaled coordinates

$$\lim_{Z\to\infty} O = O_1 + O' \tag{XI-16}$$

where $O_1$ is a one-electron operator and where $O'$ is at least
of order $1/Z$ with respect to $O_1$. (Thus H itself is an example
of such an operator). Then correct through terms of relative order
$1/Z$ the exact average of $O$ is

$$(\hat\psi, O\hat\psi) + (\Delta, O\hat\psi) + (\hat\psi, O\Delta)$$

which to the same order can be written

$$(\hat\psi, O\hat\psi) + \big(\Delta, [O_1 - (\hat\psi, O_1\hat\psi)]\hat\psi\big) + \big([O_1^\dagger - (\hat\psi, O_1^\dagger\hat\psi)]\hat\psi, \Delta\big) \tag{XI-17}$$

since as noted above $(\hat\psi, \Delta) + (\Delta, \hat\psi)$ is of relative
order $1/Z^2$. But now $O_1$ and $O_1^\dagger$ are one-electron
operators and since from our previous formal result we know that to
first order in $(H - H_0)$, and therefore at least through first
order in $1/Z$, $\Delta$ contains no one-electron excitations of $\hat\psi$,
the last two terms in (17) vanish to this order. Thus we have

$$(\psi, O\psi) = (\hat\psi, O\hat\psi)\,[1 + O(1/Z^2)] \tag{XI-18}$$

That is, averages calculated using a UHF $\hat\psi$ for any operator
satisfying (16), and hence in particular for any one electron operator,
are accurate through terms of relative order $1/Z$. Since, as we have
noted, $H$ satisfies (16) it then follows that, if we reinstate the
$Z^2$ factor and write

$$E = Z^2\left[E_0 + \frac{E_1}{Z} + \frac{E_2}{Z^2} + \cdots\right]$$

and if we write the corresponding eigenvalue in a similar way

$$\hat E = Z^2\left[\hat E_0 + \frac{\hat E_1}{Z} + \frac{\hat E_2}{Z^2} + \cdots\right]$$

then $\hat E_0 = E_0$ and $\hat E_1 = E_1$ but $\hat E_2 \neq E_2$. Thus the
correlation energy $E - \hat E$ is in first approximation independent
of $Z$. Note also that these results for UHF are true in arbitrary
external fields and therefore hold for all manner of polarizabilities,
susceptibilities, etc. (For calculations which, among other things,
illustrate these points we refer the interested reader to a series of
papers by A. Dalgarno and collaborators, notably M. Cohen, which have
been published mainly in Proc. Roy. Soc. and Proc. Phys. Soc. starting
around 1960).

Having said all this in great detail we now want to make two
further points, one major and one minor. First the minor point.
Having understood the derivation in detail it is clearly possible to
simplify it drastically: In the $Z \to \infty$ limit $\psi$ differs from
$\hat\psi$ by terms of relative order $1/Z$. Therefore for any operator
$O$ which in the $Z \to \infty$ limit becomes a one-electron operator, it
follows from our earlier formal result that the $1/Z$ correction to
its UHF expectation value vanishes identically, which of course yields
(18). For $O = H$, (18) can also be inferred directly from the
variation principle, i.e. a first order error in the wave function
involves only a second order error in the energy. In Sec. XVI we will
give another derivation of these results, a derivation which will show
that (18) also holds for OHF provided that $O$ and $O_1$ are both spin
free.

Now for the major point. This is that the preceding arguments
contain a flaw, and are not completely valid! The reason is connected
with the peculiar degeneracy of hydrogenic energy levels. Thus for
example consider the ground state of the Be atom in the absence of
external fields. In the $Z \to \infty$ limit $\hat\psi$, which is a single
determinant, becomes the function $(1s)^2(2s)^2\;{}^1S$. On the other hand
the correct result is a certain linear combination of the degenerate
pair $(1s)^2(2s)^2\;{}^1S$ and $(1s)^2(2p)^2\;{}^1S$. Thus something is wrong
with our argument since $\hat\psi$ and $\psi$ don't agree even in zero order!
Incidentally note that the MCSCF schemes can avoid this difficulty
and indeed to some extent the MCSCF method was first introduced to
deal with this problem.^7 The difficulty is of course that our
estimate of the order of $\psi^{(1)}$ is incorrect since as $Z \to \infty$
there is an energy denominator in $\psi^{(1)}$ which instead of being
of order 1 is of order $1/Z$ like the numerator, thus making $\psi^{(1)}$
actually of order 1 rather than of order $1/Z$, and similarly for
$\psi^{(2)}$ etc.

On the other hand for an isolated Lithium atom the $(1s)^2(2s)\;{}^2S$
and $(1s)^2(2p)\;{}^2P$ degeneracy causes no problem since they have quite
different symmetry, but in external fields Li becomes a case in point
as well. However it should be noted that in any case our earlier
formal argument is still generally valid. More precisely if we
introduce an order parameter $\lambda$ and write

$$H = H_0 + \lambda\,(H - H_0)$$

then corrections of order $\lambda$ to one-electron properties do vanish
without exception. However in cases like Be the terms of order $\lambda$
are not also of order $(1/Z)$.

Well what can one learn from all this as regards the accuracy
of UHF? Since none of the arguments applies to two, three, etc.
electron operators one expects, and one finds, that expectation values
of one electron operators or more generally operators like H which
satisfy (18) are given more accurately by UHF than expectation values
of two electron operators. However the arguments about the order
of accuracy are a bit shaky. We have already noted the formal
character of our first argument and the $1/Z$ statements for atoms
strictly apply only in the $Z \to \infty$ limit and therefore not to neutral
or near neutral atoms. Nevertheless as a general rule UHF does quite
well as regards one-electron properties;^9 however there is definite
evidence that for some one-electron properties of molecules, second
and higher order corrections are not negligible and, in particular,
that one electron excitations of $\hat\psi$ which, as we have seen, don't
appear in $\psi^{(1)}$ with the choice (8) of $H_0$, but which can appear in
$\psi^{(2)}$ etc., can have a non-negligible effect.^10

Turning briefly to another case, let $\hat\psi$ be a $\hat\psi_k$ of the
linear variation method. Then a natural choice for $H_0$ is $\pi H \pi$
or some linear combination of $\pi H \pi$ and $(1-\pi)H(1-\pi)$. With
this choice, all the $\hat\psi_l$ are eigenfunctions of $H_0$
and therefore from (VI-9) and (VI-10) (i.e. the
generalized Brillouin theorem) it follows that the $\hat\psi_l$ with
$l \neq k$ will not appear in $\psi^{(1)}$. Finally let us note that a common
means for improving a $\hat\psi$ is to do a further linear variation
calculation with the basis set consisting of $\hat\psi$ and
some other functions $\chi$. The generalized Brillouin theorem
then tells us that if a $\chi$ satisfies the conditions of the theorem
then it will not be directly coupled to $\hat\psi$ in the secular equation.
Also if, as is becoming increasingly popular,^11 one solves the linear
variation problem by a perturbation technique then, with appropriate
choice of $H_0$, $\chi$ will not appear in $\psi^{(1)}$ and will not affect
the energy till $E^{(4)}$.
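
The statement for the linear variation method can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration: H is a random real symmetric 4x4 matrix, the trial space is spanned by the first two basis vectors, and the helper names are hypothetical. The two optimal functions are obtained by a single rotation that diagonalizes the 2x2 block of H, after which the matrix element of H between them vanishes, so neither appears in the first order correction to the other; a function outside the trial space is in general still coupled.

```python
import math
import random

def matvec(m, v):
    """Matrix times vector for lists of lists."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def dot(u, v):
    """Inner product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))

def random_symmetric(n, seed=0):
    """Random real symmetric matrix standing in for H."""
    rng = random.Random(seed)
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            m[i][j] = m[j][i] = rng.uniform(-1.0, 1.0)
    return m

def linear_variation_2d(h):
    """Optimal functions for the trial space spanned by basis vectors 0 and 1.

    Diagonalizes the 2x2 block of h with one rotation and returns the two
    optimal functions as full-length coefficient vectors.
    """
    a, b, d = h[0][0], h[0][1], h[1][1]
    theta = 0.5 * math.atan2(2.0 * b, a - d)
    c, s = math.cos(theta), math.sin(theta)
    pad = [0.0] * (len(h) - 2)
    return [c, s] + pad, [-s, c] + pad

h = random_symmetric(4)
psi_k, psi_l = linear_variation_2d(h)
inside = dot(psi_l, matvec(h, psi_k))                    # partner inside the trial space
outside = dot([0.0, 0.0, 1.0, 0.0], matvec(h, psi_k))    # function outside the trial space
print(inside, outside)
```

The first printed number is zero to rounding error; the second is whatever the random matrix happens to give, which is the sense in which only functions satisfying the conditions of the theorem decouple.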


XII. SPECIAL THEOREMS SATISFIED BY OPTIMAL TRIAL FUNCTIONS - INTRODUCTION

Eigenfunctions of H satisfy various physically interesting
and useful special theorems (hypervirial theorems, generalized
Hellmann-Feynman theorems, etc.) and may have certain symmetries. In
this section we will show how one can choose the set of trial functions
in such a way as to be sure a priori that the optimal trial functions
$\hat\psi$ will have analogous properties. These theorems, when applicable,
can then provide physical insight into the nature of the $\hat\psi$ and
$\hat E$ and the degree to which they approximate the behavior of
actual eigenfunctions and eigenvalues. Also if one cannot determine
$\hat\psi$ and $\hat E$ exactly, the extent to which the applicable theorems
are satisfied by the approximate $\hat\psi$ and $\hat E$ can give one an
indication of how accurate the approximation is; for example how
accurately an AUHF calculation approximates UHF.

Of course one approach to symmetries is the one which we have
mentioned repeatedly: constrain the $\tilde\psi$ so that each has the
desired symmetry. Also in the last few years there has developed
a considerable literature in which, usually through the use of
Lagrange multiplier techniques, the $\tilde\psi$ are constrained to have
various properties and satisfy various theorems. However in what
follows we will be interested in more general possibilities in which
the symmetries and/or theorems are satisfied "naturally".

In all cases we will give only sufficient conditions, and it
seems that one can hardly do better than this in any useful way
because any set of $\tilde\psi$ might contain, as one unique member, an
eigenfunction which had all the desired properties. However it is
empirically an excellent rule of thumb that if the sufficient
conditions we give are not met, then it is very likely that the
$\hat\psi$ won't have the desired properties. Presumably this is the case
because the sufficient conditions which we will give are rather
natural ones.

XIII. REALITY

If, in the representation in which one is working, H is
explicitly real then, as is well known, if an eigenvalue is non
degenerate, the corresponding eigenfunction will automatically be
essentially real (i.e. $\psi^* = c\,\psi$ where $c$ is some constant)
while if the eigenvalue is degenerate the eigenfunctions may be
chosen to be real, an arbitrary degenerate eigenfunction then being
some linear combination of the real ones.

We will now show that if H is real, and if the set of trial
functions is invariant to complex conjugation, then if $\hat E$ is non
degenerate, $\hat\psi$ will be essentially real. We first note that since
$\hat E$ is anyway real, the assumption that H is real implies that

$$\hat E = \frac{(\hat\psi, H\hat\psi)}{(\hat\psi, \hat\psi)} = \frac{(\hat\psi, H\hat\psi)^*}{(\hat\psi, \hat\psi)^*} = \frac{(\hat\psi^*, H\hat\psi^*)}{(\hat\psi^*, \hat\psi^*)} \tag{XIII-1}$$

Thus $\hat\psi$ and $\hat\psi^*$ yield the same energy. However under our
assumptions $\hat\psi$ and $\hat\psi^*$ are both in the set of trial
functions, and therefore if $\hat E$ is non degenerate it follows
that $\hat\psi$ and $\hat\psi^*$ must be proportional to one another as we
wanted to prove. UHF and OHF, and the linear variation method if
the basis functions can be chosen real, are examples of sets of trial
functions which are invariant to complex conjugation.

If $\hat E$ is degenerate there does not seem to be a simple
general theorem unless one has a situation in which the trial functions
can be labelled in such a way that they are complex only because
certain linear variational parameters and/or functions can be
complex.^1 Then we can show that the $\hat\psi$ are either automatically
essentially real or can be chosen real, an arbitrary degenerate
$\hat\psi$ then being some linear combination of the real $\hat\psi$.
Proof: In such a case since, for fixed values of the non linear
quantities, the real functions $(\tilde\psi + \tilde\psi^*)$ and
$i(\tilde\psi - \tilde\psi^*)$ also belong to the set, it follows that as far
as varying the linear quantities is concerned we can deal with a real
basis set. For fixed values of the non linear quantities, the optimal
trial functions are thus eigenfunctions of a real projected
Hamiltonian. Therefore if there is no degeneracy at this stage, the
optimal function is real and hence, under our assumptions, will stay
real as one determines the optimal values of the non linear quantities.
On the other hand if there is degeneracy at the linear stage then we
can anyway write our trial functions at this stage as a linear
combination of real orthonormal functions $\tilde\psi_{R\alpha}$ which, from
(VI-9) and (VI-10), satisfy

$$(\tilde\psi_{R\alpha}, H\tilde\psi_{R\beta}) = \tilde E\,\delta_{\alpha\beta} \tag{XIII-2}$$

and where $\tilde E$ and the $\tilde\psi_{R\alpha}$ depend on the non linear
quantities. But now going on to vary the non linear quantities one sees
that a general $\tilde\psi$, which will now have the form
$\sum_\alpha a_\alpha \tilde\psi_{R\alpha}$, yields the energy $\tilde E$
whatever the values of the $a_\alpha$ since from (2) we have

$$\frac{\big(\sum_\alpha a_\alpha\tilde\psi_{R\alpha},\, H \sum_\beta a_\beta\tilde\psi_{R\beta}\big)}{\big(\sum_\alpha a_\alpha\tilde\psi_{R\alpha},\, \sum_\beta a_\beta\tilde\psi_{R\beta}\big)} = \tilde E$$

Therefore the degeneracy persists and the $a_\alpha$ are left
undetermined by variation of the non linear quantities. Thus, an
arbitrary degenerate $\hat\psi$ will be some linear combination of the
real $\hat\psi_{R\alpha}$, which completes the proof.




One final note: Whether or not $\hat\psi^*$ was in the
set of trial functions, one can in any case produce optimal real
functions by doing a further linear variation calculation with $\hat\psi$
and $\hat\psi^*$ as basis set. Proof: Instead of using $\hat\psi$ and
$\hat\psi^*$ as the basis for this further calculation we may equally well
use the real basis set $\hat\psi_R = (\hat\psi + \hat\psi^*)/2$ and
$\hat\psi_I = (\hat\psi - \hat\psi^*)/2i$. Therefore it
follows from the above discussion that the result will be two real
(orthogonal) linear combinations of $\hat\psi_R$ and $\hat\psi_I$, and of
course, as a bonus, at least one of these combinations will have an
energy less than (or at least not greater than) the original $\hat E$.
Indeed even without a further detailed calculation one can see that one
or the other of the real functions $\hat\psi_R$ or $\hat\psi_I$ will
themselves yield a lower energy, or at least an energy which is no
higher than $\hat E$. Proof: From (1) and the fact that
$(\hat\psi_R, H\hat\psi_I) = (\hat\psi_I, H\hat\psi_R)$ one can easily verify
that

$$\hat E = \frac{(\hat\psi_R, H\hat\psi_R) + (\hat\psi_I, H\hat\psi_I)}{(\hat\psi_R, \hat\psi_R) + (\hat\psi_I, \hat\psi_I)}$$

which is certainly not less than the smaller of
$(\hat\psi_R, H\hat\psi_R)/(\hat\psi_R, \hat\psi_R)$ and
$(\hat\psi_I, H\hat\psi_I)/(\hat\psi_I, \hat\psi_I)$. However in general
neither $\hat\psi_R$ nor $\hat\psi_I$ will be optimal functions in the sense
of this paragraph. We will discuss this point further below.
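
The last inequality is easy to check numerically. The sketch below is illustrative only: H is a random real symmetric matrix and the complex trial vector is random rather than optimal, which suffices because the identity used holds for any complex function and any real symmetric H. The helper names are hypothetical.

```python
import random

def energy(psi, h):
    """Rayleigh quotient (psi, H psi) / (psi, psi) for a real symmetric h.

    psi may hold real or complex components.
    """
    n = len(psi)
    num = sum(psi[i].conjugate() * h[i][j] * psi[j]
              for i in range(n) for j in range(n))
    den = sum(abs(x) ** 2 for x in psi)
    return (num / den).real

def random_symmetric(n, rng):
    """Random real symmetric matrix standing in for a real H."""
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            m[i][j] = m[j][i] = rng.uniform(-1.0, 1.0)
    return m

rng = random.Random(0)
n = 4
h = random_symmetric(n, rng)
psi = [complex(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)) for _ in range(n)]
psi_r = [z.real for z in psi]   # real part of the trial function
psi_i = [z.imag for z in psi]   # imaginary part of the trial function

e, e_r, e_i = energy(psi, h), energy(psi_r, h), energy(psi_i, h)
print(e, e_r, e_i)
print(min(e_r, e_i) <= e + 1e-12)
```

The complex energy is the norm-weighted average of the two real energies, so the smaller of the two can never exceed it.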
XIV. UNITARY INVARIANCE

The eigenvalues of $H$ are invariant to a unitary transformation
of $H$. Let $U$ be a unitary operator, and suppose that one
uses the same set of trial functions in the variation calculation for
$U^\dagger H U$ as one uses in the calculation for $H$. Then we can show
that if the set of trial functions is invariant to the
transformation $\tilde\psi \to U\tilde\psi$, the $\hat E$ for $U^\dagger H U$
are the same as the $\hat E$ for $H$. The proof is as follows: To
find the $\hat E$ for $H$ we look for the stationary values of

$$\tilde E = \frac{(\tilde\psi, H\tilde\psi)}{(\tilde\psi, \tilde\psi)}$$

as $\tilde\psi$ ranges through the set. To find variational approximations
to the eigenvalues of $U^\dagger H U$ we look for the stationary values of

$$\tilde E' = \frac{(\tilde\psi, U^\dagger H U\tilde\psi)}{(\tilde\psi, \tilde\psi)}$$

which, since $U$ is unitary, we can write as

$$\tilde E' = \frac{(U\tilde\psi, H\,U\tilde\psi)}{(U\tilde\psi, U\tilde\psi)}$$

If now the set of $U\tilde\psi$ is the same as the set of $\tilde\psi$
it is clear that the stationary values of $\tilde E$ and $\tilde E'$ will be
the same, which proves the point. Note that if the set is
invariant to $U$ then it is invariant to all powers
of $U$ and to $U^\dagger$.
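
A minimal numerical sketch of this result follows, under assumptions chosen purely for convenience: H is a random real symmetric 4x4 matrix, the set of trial functions is the linear space spanned by the first two basis vectors, and U is a permutation matrix (a real unitary matrix). The names are hypothetical. One permutation maps the trial space onto itself, the other moves a basis vector out of it.

```python
import math
import random

def matmul(a, b):
    """Product of two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def permutation_matrix(p):
    """Real unitary matrix that sends basis vector j to basis vector p[j]."""
    n = len(p)
    u = [[0.0] * n for _ in range(n)]
    for j, i in enumerate(p):
        u[i][j] = 1.0
    return u

def variational_energies(h):
    """Stationary values of the Rayleigh quotient over the span of basis vectors 0 and 1."""
    a, b, d = h[0][0], h[0][1], h[1][1]
    mean, half = 0.5 * (a + d), math.hypot(0.5 * (a - d), b)
    return mean - half, mean + half

def random_symmetric(n, seed=0):
    rng = random.Random(seed)
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            m[i][j] = m[j][i] = rng.uniform(-1.0, 1.0)
    return m

h = random_symmetric(4)
u_inv = permutation_matrix([1, 0, 3, 2])   # maps the trial space onto itself
u_mix = permutation_matrix([2, 1, 0, 3])   # moves basis vector 0 out of the trial space
h_inv = matmul(transpose(u_inv), matmul(h, u_inv))   # U-dagger H U
h_mix = matmul(transpose(u_mix), matmul(h, u_mix))

print(variational_energies(h))
print(variational_energies(h_inv))   # same stationary values as for h
print(variational_energies(h_mix))   # in general different
```

The exact eigenvalues of all three matrices coincide, of course; it is only the variational values within the fixed trial space that distinguish the invariant from the non-invariant case.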


In particular a spatial translation is a unitary transformation,
the operator $U$ being $e^{-i\mathbf{a}\cdot\mathbf{P}}$ where $\mathbf{a}$ is the
amount of the translation and $\mathbf{P}$ is the momentum operator for the
particle or particles being translated. Thus if the set of trial
functions is invariant to a rigid translation of the electronic
coordinates then one will get the same $\hat E$ whether one uses $H$ as
given or whether in $H$ one replaces the $\mathbf{r}_s$ by
$\mathbf{r}_s + \mathbf{a}$ where $\mathbf{a}$ is a constant vector, leaving
everything else unchanged. Similar remarks apply to rotations where
$U = e^{-i\boldsymbol{\theta}\cdot\mathbf{J}}$ with $\boldsymbol{\theta}$ the
angle (and axis) of rotation and $\mathbf{J}$ the angular momentum of the
rotated particle or particles, and to inversion where $U$ is the product
of inversions applied to each particle separately.

Since the operator $U$ for a rigid translation, or rotation, or
inversion of all the electrons takes the form

$$U = \prod_{s=1}^{N} u(s) \tag{XIV-1}$$

they transform a Slater determinant into a Slater determinant.
Therefore UHF is invariant to rigid rotations, translations and
inversions, and to any other $U$ of this type. Further if we consider
only spatial rotations, i.e. we don't rotate the spins, then since the
$u$'s for such rotations, and for translations, and inversion are spin
independent it follows that OHF is also invariant to such rotations
and to translations and inversions. The situation with respect to
analytic approximations depends of course on the nature of the basis
sets.

Another interesting $U$ having the structure (1) is that of a
gauge transformation. Thus we have the result that UHF is gauge
invariant and further, since the appropriate $U$ is spin independent
we can assert that OHF is also gauge invariant, with again the
properties of AUHF etc. depending on the nature of the basis sets.

As a final example of a unitary transformation consider the
transformation from the coordinate representation to the momentum
representation. In the abstract operator approach we have been using,
$U\tilde\psi = \tilde\psi^F$ where $\tilde\psi^F$ is the Fourier transform of
$\tilde\psi$. (Here and in what follows we will suppress spin labels unless
needed). That is, if in momentum space we continue to use the symbol
$\mathbf{r}$ to denote the independent variable, then momentum is
represented by $\mathbf{r}$ and coordinate by $i\nabla$. Thus for
invariance the set of $\tilde\psi^F$ must be the same as the set of
$\tilde\psi$. Since here too $U$ is a product of spin independent
transformations of each particle separately it follows that UHF and OHF
are invariant, i.e. the Fourier transformation of a Slater determinant
is a Slater determinant. Also since orbital angular momentum is
symmetric in coordinate and momentum it should be no surprise that the
Fourier transform of a spherical harmonic is a spherical harmonic and
therefore that most RHF approximations for atoms are invariant to
transformation to momentum space.^2

In many interesting cases a transformation $U$ of the dynamical
variables is equivalent to a change in some parameter(s) $\sigma$ in the
Hamiltonian:

$$U^\dagger H(\sigma)\,U = H(\sigma') \tag{XIV-2}$$

The eigenvalues of $H$ are therefore invariant to the transformation
$\sigma \to \sigma'$, and evidently $H$ is invariant to the combined
transformations $H \to U^\dagger H U$ and $\sigma' \to \sigma$. We will now
show that the $\hat E$ will be invariant to the transformation
$\sigma \to \sigma'$ if the set of trial functions is invariant to the
combined transformations $\tilde\psi \to U\tilde\psi$ and $\sigma' \to \sigma$.
Proof:

$$\tilde E(\sigma') = \frac{(\tilde\psi(\sigma'), H(\sigma')\tilde\psi(\sigma'))}{(\tilde\psi(\sigma'), \tilde\psi(\sigma'))}$$

which from (2) can be written as

$$\tilde E(\sigma') = \frac{(U\tilde\psi(\sigma'), H(\sigma)\,U\tilde\psi(\sigma'))}{(U\tilde\psi(\sigma'), U\tilde\psi(\sigma'))}$$

Therefore if the set $U\tilde\psi(\sigma')$ is the same as the set
$\tilde\psi(\sigma)$, the $\hat E(\sigma')$ will be the same as the
$\hat E(\sigma)$ which proves the point. In particular if the set is
independent of $\sigma$ then the $\hat E$ will be invariant to
$\sigma \to \sigma'$ if the set of trial functions is invariant to the
transformation $U$.

The special case $U = 1$ is also of interest. Thus suppose that
$H$ is invariant to some transformation $\sigma \to \sigma'$. Then we have
the result that if the set of trial functions is invariant to
the transformation, then the $\hat E$ will be invariant.

XV. SYMMETRY 

If H commutes with a unitary operator U then there are certain
consequences which we now want to discuss. However first we must settle
a point of notation: If H commutes with a Hermitian operator T then
it commutes with the whole set of unitary operators $e^{i\alpha T}$ where
$\alpha$ is a real parameter, and conversely if it commutes with the set
it commutes with T. In such a case we will use the symbol
U to represent the whole set, so that for example the statement
that a wave function is an eigenfunction of U will therefore also
mean that it is an eigenfunction of T.

If H commutes with U then if an eigenvalue of H is nondegenerate,
the corresponding eigenfunction is automatically an eigenfunction
of U while if the eigenvalue is degenerate one can find a set of
functions which are simultaneous eigenfunctions of H and U and such
that an arbitrary degenerate eigenfunction is some linear combination
of the members of this set.

We will now show that if H commutes with U, and if the set of
trial functions is invariant to U, then if $\hat E$ is
nondegenerate, the corresponding $\hat\psi$ is an eigenfunction of U. The
proof follows a pattern similar to that in Sec. XIII. We observe that
if H commutes with U then

$$\hat E = \frac{(\hat\psi, H\hat\psi)}{(\hat\psi, \hat\psi)} = \frac{(\hat\psi, U^\dagger H U\hat\psi)}{(\hat\psi, U^\dagger U\hat\psi)} = \frac{(U\hat\psi, H\,U\hat\psi)}{(U\hat\psi, U\hat\psi)}$$

Thus both $\hat\psi$ and $U\hat\psi$ yield the same energy. Therefore since
both are in the set it follows that if $\hat E$ is nondegenerate,
then $\hat\psi$ and $U\hat\psi$ must be proportional which is what we
want to prove.
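
As an illustration of the nondegenerate case, the following sketch builds a random real symmetric H that commutes with a permutation U which exchanges basis vectors in pairs, takes the span of the first pair as the (U-invariant) set of trial functions, and checks that the optimal function is an eigenfunction of U. The construction of H by averaging and all helper names are assumptions introduced for this example only.

```python
import math
import random

PAIR_SWAP = [1, 0, 3, 2]   # U exchanges basis vectors 0<->1 and 2<->3

def apply_u(v):
    """Action of the permutation U on a coefficient vector."""
    return [v[PAIR_SWAP[i]] for i in range(len(v))]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def symmetric_commuting_h(seed=0):
    """Random real symmetric 4x4 matrix averaged so that it commutes with U."""
    rng = random.Random(seed)
    n = 4
    h = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            h[i][j] = h[j][i] = rng.uniform(-1.0, 1.0)
    p = PAIR_SWAP
    return [[0.5 * (h[i][j] + h[p[i]][p[j]]) for j in range(n)] for i in range(n)]

def optimal_function(h):
    """One optimal function of the linear variation method in the span of vectors 0 and 1."""
    a, b, d = h[0][0], h[0][1], h[1][1]
    theta = 0.5 * math.atan2(2.0 * b, a - d)
    return [math.cos(theta), math.sin(theta), 0.0, 0.0]

h = symmetric_commuting_h()
psi = optimal_function(h)
u_psi = apply_u(psi)
eigenvalue = dot(psi, u_psi)   # should be +1 or -1 when h[0][1] is non zero
residual = max(abs(x - eigenvalue * y) for x, y in zip(u_psi, psi))
print(eigenvalue, residual)
```

Because the two diagonal elements of the 2x2 block are equal, the optimal functions are the symmetric and antisymmetric combinations of the two basis vectors, which are precisely the eigenfunctions of U in the trial space.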

We now turn to the degenerate case. Eigenvalues of H are degenerate
because there exist U's which commute with H but not with each
other. Such degeneracies however can always be removed by applying
suitable extra external fields so that in the fields all the U's which
commute with H do commute with one another. If in addition such
external fields serve to remove the degeneracy in the variational
calculation, then we know from the discussion of the previous paragraph
that the $\hat\psi$ in the fields will automatically be eigenfunctions
of those U's which commute with H in the fields, and which leave the
set of trial functions invariant. If now we let the fields tend to
zero, but continue to use the same set of trial functions, it follows
that when the external fields have been reduced to zero the $\hat\psi$
will still be eigenfunctions of these U's (we are of course assuming
that the U's are independent of the external fields). Since different
external fields single out different U's we therefore have the result
that if there is degeneracy then, among the degenerate $\hat\psi$, will
be eigenfunctions of any set of U's which commute with H and with each
other, and which leave the set of trial functions invariant.


However whether or not an arbitrary degenerate $\hat\psi$ can be
written as a linear combination of the degenerate $\hat\psi$ which are
also eigenfunctions of a particular set of U's will depend on how much
linearity there is in the set of trial functions.

The above discussion, though quite general, contains the qualification
that all degeneracies in the variational calculation should be
removable if one would only apply suitable external fields. Therefore
its applicability to actual calculations is not immediately evident.
Nevertheless it is clearly consistent with the results discussed in
Section X for atomic UHF. Thus $e^{i\alpha L_z}$, $e^{i\alpha S_z}$ and
the U for parity are all products of single particle operations and
therefore leave the set of $\tilde\psi$ invariant while
$e^{i\alpha L^2}$ and $e^{i\alpha S^2}$ are not. Therefore the present
discussion would correctly suggest that for a given $N$, UHF
$\hat\psi$ can be found which are eigenfunctions of parity, of a
component of $\vec L$ and of a component of $\vec S$, but that in
general one will not find eigenfunctions of either $L^2$ or $S^2$.
Also consistent with this point of view is the fact that in those cases
in which the $\hat\psi$ are eigenfunctions of $L^2$ and $S^2$ this is
usually forced by the behavior with respect to $L_z$ and $S_z$.
Thus for example a closed shell state is a simultaneous eigenfunction
of all components of $\vec L$ and $\vec S$ with eigenvalues zero and
therefore must be an eigenfunction of $L^2$ and $S^2$ with eigenvalues
zero. Also the function mentioned before (X-3) is a quartet function
"because" it has the maximum $M_S$ possible for the given $N$.

Less problematical but more specialized is the following theorem
which is similar to one in Sec. XIII (a special case of this theorem
was discussed in footnote A, Sec. VII): one can find $\hat\psi$ which
are eigenfunctions of U and such that an arbitrary degenerate
$\hat\psi$ is some linear combination of these eigenfunctions of U, if
the set of trial functions is invariant to U in such a way that U
induces changes only in linear parameters and/or functions; that is
if $U\tilde\psi$ involves the same non linear quantities as
$\tilde\psi$, but possibly different linear ones. The proof follows
from the fact that under these conditions the $\Pi$ appropriate to the
linear part of the calculation will commute with U and therefore so
will $\bar H$. The pattern of the proof is then identical to the proof
of the analogous theorem in Sec. XIII.

Also, again analogously to the discussion in Sec. XIII, whether
or not the functions $U\hat\psi$, $U^2\hat\psi$, ..., all of which
yield the same energy, are all in the set, one can in any case produce
an optimal set of functions which are eigenfunctions of U, by doing a
linear variation calculation in the space spanned by these functions
(U here could represent a complete commuting set of operators). That
this will produce eigenfunctions of U is guaranteed by the fact that
this linear space is obviously invariant to the action of U, and
therefore the $\Pi$ appropriate to it will commute with U. Indeed it is
easy to see what the result of this calculation will be. Namely we can
expand $\hat\psi$ in normalized eigenfunctions of U belonging to
different eigenvalues, thus

$$\hat\psi = \sum_k a_k \psi_k \tag{XV-1}$$

where the sum may be infinite, though hopefully it is only finite. Then
since $U\hat\psi$, $U^2\hat\psi$, etc. are simply linear combinations
of these same functions with different coefficients, it is clear that
the functions $\psi_k$ involved in the sum span the linear space formed
from $\hat\psi$, $U\hat\psi$, $U^2\hat\psi$, .... Further since

$$(\psi_k, \psi_l) = 0, \qquad (\psi_k, H\psi_l) = 0, \qquad k \neq l \tag{XV-2}$$

it follows from the "converse theorem" (V) of Sec. VI that the
$\psi_k$ will be the $\hat\psi_k$ which would result from the linear
variation calculation. Thus instead of doing the linear variation
calculation we can simply project out of $\hat\psi$ the various
symmetry components which it contains. This procedure, and
approximations to it, have been extensively applied to UHF functions,
particularly to produce functions of a definite total spin $S^2$ (this
is a case in which $U\hat\psi$ etc. don't belong to the original set
since, as noted earlier, $S^2$ is not a one particle operator). Of
course in general one can do even better if, as mentioned in Sec. VIII,
one projects before carrying out the original variation calculation.
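
As a toy illustration of projecting out symmetry components, one can
replace the function space by a two-dimensional vector space. The
matrices `H` and `U` and the vector `psi` below are hypothetical
stand-ins chosen only so that H commutes with U; they are not taken from
any calculation discussed here. The sketch checks that the two symmetry
components of `psi` are not coupled by H, and that the better of the two
lies at or below the energy of the unprojected function.

```python
# Toy model: a 2-dimensional "function space" in which H commutes with U.
# U swaps the two basis functions and so has eigenvalues +1 and -1.

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def rayleigh(h, v):
    """Energy (v, H v) / (v, v) of the vector v."""
    return dot(v, matvec(h, v)) / dot(v, v)

H = [[-1.0, 0.4], [0.4, -1.0]]   # hypothetical Hamiltonian matrix, commutes with U
U = [[0.0, 1.0], [1.0, 0.0]]     # swap operator

psi = [1.0, 0.3]                 # stand-in for a trial function of no definite symmetry

u_psi = matvec(U, psi)
even = [(a + b) / 2 for a, b in zip(psi, u_psi)]   # component with U = +1
odd = [(a - b) / 2 for a, b in zip(psi, u_psi)]    # component with U = -1

overlap = dot(even, odd)                # (psi_k, psi_l) for k != l
coupling = dot(even, matvec(H, odd))    # (psi_k, H psi_l) for k != l
e_even = rayleigh(H, even)
e_odd = rayleigh(H, odd)
e_psi = rayleigh(H, psi)

print(overlap, coupling, e_even, e_odd, e_psi)
```

Because the overlap and the coupling both vanish, the projected
components are already the solutions of the linear variation problem in
the space they span, which is the content of (XV-2) in this small model.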

There are certain similarities between the procedure we have just
discussed and that in Sec. XIII. However there are also certain
differences and we would now like to draw attention to these in more
detail. If we write $\hat\psi$ in the notation of Sec. XIII,
$\hat\psi = \psi_R + i\psi_I$ with $\psi_R$ and $\psi_I$ real,
then we have written $\hat\psi$ as a linear combination of
(unnormalized) eigenfunctions of the complex conjugation operator K
belonging to different eigenvalues:

$$K\psi_R = \psi_R, \qquad K(i\psi_I) = -(i\psi_I)$$

So here we have a certain similarity with (1). However as we remarked
in Sec. XIII, $\psi_R$ and $\psi_I$ are in general not the functions
which result from the further linear variation calculation and this is
to be contrasted with the result of the present section, that the
$\psi_k$ are the functions which result from a further linear variation
calculation. The difference arises because while U is a linear operator
of the familiar sort, K is not. Rather it is what is called antilinear;
$K(\alpha\psi) = \alpha^* K\psi$ if $\alpha$ is a number, and thus is
neither Hermitian nor unitary. Rather one has

$$(K\phi, K\psi) = (\phi, \psi)^*$$

In particular then, from $K\psi_R = \psi_R$ and
$K(i\psi_I) = -(i\psi_I)$ one concludes, not that $\psi_R$ and
$i\psi_I$ are orthogonal but only that $(\psi_R, i\psi_I)$ is pure
imaginary, which is anyway obvious: Thus following the canonical
pattern

$$(K\psi_R, K\,i\psi_I) = -(\psi_R, i\psi_I)$$

but also

$$(K\psi_R, K\,i\psi_I) = (\psi_R, i\psi_I)^*$$

which completes the proof. Note also that although $\psi_R$ and
$i\psi_I$ belong to different eigenvalues of K, $\psi_R$ and $\psi_I$
belong to the same eigenvalue. Similarly from the fact that K commutes
with H one can not conclude that $(\psi_R, H\psi_I) = 0$. Finally let
us note that in a certain sense $\psi_R$ and $\psi_I$ are not very well
defined, since, if $\alpha$ is a real number, $\hat\psi$ and
$e^{i\alpha}\hat\psi$ are physically equivalent yet the real and
imaginary parts of $e^{i\alpha}\hat\psi$ are quite different from
$\psi_R$ and $\psi_I$.
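
The antilinear properties of K quoted above can be checked numerically
on functions sampled at a few points. The sample values and the names
`psi`, `phi` and `alpha` below are hypothetical; the sketch only
assumes that K acts as pointwise complex conjugation and that the inner
product is conjugate-linear in its first argument.

```python
def inner(f, g):
    """(f, g) = sum of conj(f_i) * g_i for functions sampled on a grid."""
    return sum(a.conjugate() * b for a, b in zip(f, g))

def K(f):
    """Complex conjugation operator acting pointwise."""
    return [a.conjugate() for a in f]

psi = [1 + 2j, 0.5 - 1j, -0.3 + 0.7j]   # hypothetical sampled complex function
phi = [0.2 - 0.4j, 1.0 + 0.1j, 0.6j]
alpha = 0.8 + 0.6j

# Antilinearity: K(alpha psi) = alpha* K psi
lhs = K([alpha * a for a in psi])
rhs = [alpha.conjugate() * a for a in K(psi)]

# (K phi, K psi) = (phi, psi)*
kk = inner(K(phi), K(psi))

# Real and imaginary parts: K psi_R = psi_R and K(i psi_I) = -(i psi_I)
psi_r = [complex(a.real, 0.0) for a in psi]
i_psi_i = [complex(0.0, a.imag) for a in psi]
cross = inner(psi_r, i_psi_i)   # pure imaginary, but in general not zero

print(kk, inner(phi, psi).conjugate(), cross)
```

The last quantity illustrates the point made in the text: eigenfunctions
of K belonging to different eigenvalues need not be orthogonal.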


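XVI. GENERALIZED HELLMANN-FEYNMAN THEOREM

Suppose that H contains a real parameter $\sigma$. Then by
differentiating

$$(\hat\psi, (H - \hat E)\hat\psi) = 0$$

with respect to $\sigma$ we find

$$\begin{aligned}
&\left(\frac{\partial\hat\psi}{\partial\sigma}, (H-\hat E)\hat\psi\right) + \left(\hat\psi, (H-\hat E)\frac{\partial\hat\psi}{\partial\sigma}\right) \\
&\qquad + \left(\hat\psi, \left(\frac{\partial H}{\partial\sigma} - \frac{\partial\hat E}{\partial\sigma}\right)\hat\psi\right) = 0
\end{aligned} \tag{XVI-1}$$

where in carrying out the differentiations we have of course kept
the integration variables fixed. In this connection it should be
especially noted that if one changes variable then in general
$\partial H/\partial\sigma$ will change if the change of variable is
$\sigma$ dependent. Also we have assumed that the volume element in the
integration does not depend on $\sigma$. For many cases of interest in
which the volume element does depend on $\sigma$ the dependence is only
multiplicative and hence cancels out.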

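We will now show that if the set of trial functions is invariant
to changes in the value of $\sigma$ then the first line of Eq. (1) will
be separately equal to zero so that we will have

$$\frac{\partial\hat E}{\partial\sigma} = \frac{\left(\hat\psi, \dfrac{\partial H}{\partial\sigma}\hat\psi\right)}{(\hat\psi, \hat\psi)} \tag{XVI-2}$$

which is the variational version of the generalized Hellmann-Feynman
theorem for $\sigma$. The proof is as follows:

Though as a whole the set of trial functions is, by assumption,
independent of $\sigma$, still which particular members of the set are
selected by the variational method as optimal trial functions will in
general depend on the value of $\sigma$. Therefore let
$\hat\psi(\sigma_0)$ be an optimal trial function when
$\sigma = \sigma_0$, and let $\hat\psi(\sigma_0 + \delta\sigma)$
be the corresponding optimal trial function for a slightly different
value of $\sigma$ (we are assuming that the dependence on $\sigma$ is
continuous). Now by our assumption $\hat\psi(\sigma_0)$ and
$\hat\psi(\sigma_0 + \delta\sigma)$ both belong to the set from which
$\hat\psi(\sigma_0)$ was selected. Therefore
$\hat\psi(\sigma_0 + \delta\sigma)$ was one of the neighboring
functions which was examined in testing for the stationarity of
$\hat E(\sigma_0)$. This in turn implies that

$$\delta\psi = \hat\psi(\sigma_0 + \delta\sigma) - \hat\psi(\sigma_0) = \left.\frac{\partial\hat\psi}{\partial\sigma}\right|_{\sigma_0}\delta\sigma$$

must be a $\delta\psi$ satisfying (IV-3); that is, cancelling the
factor $\delta\sigma$, we must have

$$\left(\frac{\partial\hat\psi}{\partial\sigma}, (H-\hat E)\hat\psi\right) + \left(\hat\psi, (H-\hat E)\frac{\partial\hat\psi}{\partial\sigma}\right) = 0 \qquad \text{at } \sigma = \sigma_0$$

which, since $\sigma_0$ could be any value of $\sigma$, proves the
point.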
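
A small numerical check of this result may be helpful. The sketch below
assumes a linear space spanned by two orthonormal basis functions which
do not depend on $\sigma$, with H depending linearly on $\sigma$; the
matrix elements are hypothetical numbers chosen only for illustration.
It compares the derivative of the optimal energy, obtained by finite
differences, with the expectation value of $\partial H/\partial\sigma$.

```python
import math

def lowest(h):
    """Lowest eigenvalue and an eigenvector of a real symmetric 2x2 matrix."""
    a, b, d = h[0][0], h[0][1], h[1][1]
    e = 0.5 * (a + d) - math.sqrt(0.25 * (a - d) ** 2 + b * b)
    vec = [b, e - a]          # valid eigenvector as long as b != 0
    return e, vec

H0 = [[-0.5, 0.2], [0.2, 0.3]]     # hypothetical sigma-independent part
V = [[1.0, 0.5], [0.5, -2.0]]      # hypothetical dH/dsigma

def h_matrix(sigma):
    """H(sigma) = H0 + sigma * V in a basis that does not depend on sigma."""
    return [[H0[i][j] + sigma * V[i][j] for j in range(2)] for i in range(2)]

def expectation(m, vec):
    """(vec, m vec) / (vec, vec) for a real vector."""
    num = sum(vec[i] * m[i][j] * vec[j] for i in range(2) for j in range(2))
    return num / sum(c * c for c in vec)

sigma0, step = 0.7, 1e-5
e0, psi = lowest(h_matrix(sigma0))
hf_value = expectation(V, psi)                    # right hand side of (XVI-2)
e_plus = lowest(h_matrix(sigma0 + step))[0]
e_minus = lowest(h_matrix(sigma0 - step))[0]
numerical_slope = (e_plus - e_minus) / (2 * step) # left hand side by central difference

print(hf_value, numerical_slope)
```

The two numbers agree to within the accuracy of the finite difference,
as expected since the basis, and hence the set of trial functions, is
independent of $\sigma$.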

The theorem which we have just proven is in its essentials due to
Hurley, and we would like to emphasize its simplicity and its generality
since this does not seem to be widely enough appreciated. In particular
there are in the literature (subsequent to Hurley's work) many very
detailed derivations of special Hellmann-Feynman theorems (i.e. special
choices of $\sigma$) for particular variational approximations;
derivations which are quite unnecessary since the results are immediate
consequences of Hurley's theorem.

Obvious examples of sets of trial functions which are invariant
to changes in $\sigma$ are (i) the trial functions of most SCF (UHF,
OHF, restricted HF etc.) type approximations since there are
usually no a priori requirements as to how the spin orbitals should
depend on possible $\sigma$ like nuclear charge, nuclear configuration,
strength of external fields, etc., (ii) the trial functions of analytic
SCF approximations if the basis set is independent of $\sigma$, (iii)
linear spaces in which the basis set is independent of $\sigma$.

In the latter two cases, if the basis set is fixed, invariance
requires either that the individual basis functions $\phi_k$ don't
involve $\sigma$ at all or more generally that
$\partial\phi_k/\partial\sigma$ is a linear combination of the
$\phi_k$. However if the basis functions also involve non linear
parameters (thus not really a linear space) then more flexibility is
possible.

As an application of the generalized Hellmann-Feynman theorem we
will use it to derive, and extend, some of the results found in Sec. XI.
In the spirit of double perturbation theory let us include in the
"Hamiltonian" an additive term of the form $\mu W$ where $\mu$ is a
real parameter and where we will be interested in the limit
$\mu \to 0$. If the set of trial functions is invariant to changes in
$\mu$ then we will have

$$\frac{\partial\hat E}{\partial\mu} = \frac{(\hat\psi, W\hat\psi)}{(\hat\psi, \hat\psi)} \tag{XVI-3}$$

Now suppose that in the limit $\nu \to 0$, where $\nu$ is some other
parameter, $\hat\psi$ becomes an eigenfunction of the "Hamiltonian".
Then barring problems with degeneracy the error in $\hat\psi$ must be
of order $\nu^k$ where $k$ is some constant, and therefore the error in
$\hat E$ will be of order $\nu^{2k}$. But now if $\psi$ is the
eigenfunction to which $\hat\psi$ approximates, and if E is the
corresponding eigenvalue then

$$\frac{\partial E}{\partial\mu} = \frac{(\psi, W\psi)}{(\psi, \psi)} \tag{XVI-4}$$

Comparing (3) and (4) we see that the error in the expectation value
of $W$ is the same as the error in $\partial\hat E/\partial\mu$ in the
limit $\mu \to 0$, which from the above is of order $\nu^{2k}$.

This result then evidently includes that of Sec. XI for UHF as a
special case if we identify $\nu$ with $1/Z$. Also for OHF we see that,
with the further qualification that $W$ and $H$ be spin independent,
we have a similar result since in the limit the "Hamiltonian"
will be a spin free one-electron operator and hence will
have the $\tilde\psi$ of the OHF as eigenfunctions.


XVII. HYPERVIRIAL THEOREMS - GENERAL


Let $G$ be a Hermitian operator and suppose that among the
variations of $\hat\psi$ which are possible within the set of trial
functions is

$$\delta\psi = i\alpha G\hat\psi \tag{XVII-1}$$

where $\alpha$ is a small real parameter. Then (IV-3) must be satisfied
with $\delta\psi$ given by (1). Thus we have

$$(i\alpha G\hat\psi, (H-\hat E)\hat\psi) + (\hat\psi, (H-\hat E)\, i\alpha G\hat\psi) = 0 \tag{XVII-2}$$

which immediately simplifies to

$$(\hat\psi, [H, G]\hat\psi) = 0 \tag{XVII-3}$$

and we have the result that under the given conditions, $\hat\psi$
satisfies the hypervirial theorem for $G$.

We now note that (1) is the term of first order in $\alpha$ in the
expression

$$e^{i\alpha G}\hat\psi - \hat\psi = i\alpha G\hat\psi + \cdots \tag{XVII-4}$$

From this observation it then follows that a sufficient condition for
$\hat\psi$ to satisfy the hypervirial theorem for $G$ is that the set
of trial functions be invariant with respect to all unitary
transformations $e^{i\alpha G}$ where $\alpha$ is an arbitrary real
number. Proof: If the set is invariant to such transformations then the
first term on the left hand side of (4) will be a function near to
$\hat\psi$ in the set, and therefore (1) will be a possible variation
of $\hat\psi$ within the set, whence the result follows.

It should be clear that this condition is only sufficient and not
necessary. Thus we really need only that (1) be a possible variation
for one real value of $\alpha$. Also under our conditions
$i\alpha G\tilde\psi$ is a possible variation of any $\tilde\psi$
within the set, whereas we need this to be true only for $\hat\psi$.

As an application of these results let us consider UHF. If $G$
is a one particle operator

$$G = \sum_s g(s)$$

then $e^{i\alpha G}$ takes the form $\prod_s e^{i\alpha g(s)}$.
Thus it follows from the discussion in Sec. XII-B that UHF satisfies
the hypervirial theorem for any one-particle $G$ (Hermitian or not,
since the hypervirial theorem is linear in $G$ and any one-particle
operator $G$ can be written as a linear combination of two one-particle
Hermitian operators, for example $(G + G^\dagger)/2$ and
$(G - G^\dagger)/2i$). Correspondingly weaker statements hold for
restricted HF schemes. Thus consider OHF. Then clearly we have the
result that the optimal trial functions will satisfy (3) for any spin
independent one electron operator.

In our discussion so far we have insisted that $\alpha$ be real.
However UHF is formally invariant to transformation by
$e^{i\alpha G}$ whether or not $\alpha$ is real so long as $G$ is a
one-particle operator. Returning then to (2) and assuming $\alpha$ to
be pure imaginary and $G$ Hermitian one then finds that the $\hat\psi$
of UHF will also satisfy

$$(\hat\psi, [G(H-\hat E) + (H-\hat E)G]\hat\psi) = 0 \tag{XVII-5}$$

or, combining (3) and (5)

$$(\hat\psi, G(H-\hat E)\hat\psi) = 0 = (\hat\psi, (H-\hat E)G\hat\psi) \tag{XVII-6}$$

Similar results of course also apply to OHF with any spin independent
Hermitian one-particle operator. Since however non unitary
transformations are often not pleasant to deal with (they may transform
a normalizable function into an unnormalizable one), the following
derivation of (6) for UHF and OHF based directly on the variation
method may be more convincing.

With $G$ a Hermitian one-particle operator it follows from the
considerations of Sec. XI that for UHF (and for OHF if $G$ is also
spin independent), if we write

$$G\hat\psi = a\hat\psi + \Phi$$

then $\Phi$ satisfies the conditions of Brillouin's theorem. Thus we
have

$$(\hat\psi, (H-\hat E)G\hat\psi) = (\hat\psi, (H-\hat E)\Phi) = 0$$

which is the right hand side equality in (6). The left hand side
equality then follows by complex conjugation.

Turning now to situations in which the set of trial functions forms
a linear space let us return to (1) and write it as

$$\delta\psi = (\hat\psi + i\alpha G\hat\psi) - \hat\psi \tag{XVII-7}$$

Then we see that if $G$ applied to $\hat\psi$ yields a function in the
space, then (1), with no restriction on $\alpha$, will certainly be a
possible variation of $\hat\psi$ within the space since, by linearity,
the first term in (7) is a neighboring function in the space.
Therefore, as a sufficient condition we can say that if the set of
trial functions forms a linear space then (6), and hence (3), will be
satisfied if $G$ applied to any function in the space yields a function
in the space.

Note however that this sufficient condition, though stated a bit
differently, is formally equivalent to our general sufficient condition
applied to this case: first of all it implies that $e^{i\alpha G}$,
with $\alpha$ an arbitrary complex number, applied to any function in
the space yields a function in the space, since if $G\tilde\psi$ is in
the space so is $G^2\tilde\psi$ etc. and therefore by linearity so is
$e^{i\alpha G}\tilde\psi$. Conversely if $e^{i\alpha G}\tilde\psi$
is in the space then by linearity so is

$$\lim_{\alpha\to 0}\frac{e^{i\alpha G}\tilde\psi - \tilde\psi}{i\alpha} = G\tilde\psi$$

The following direct derivation of (6) for the linear case is also
of interest: Since $\hat\psi$ is an eigenfunction of $\bar H$ we have

$$(\hat\psi, (\bar H - \hat E)G\hat\psi) = 0$$

But $\Pi\hat\psi = \hat\psi$, and $\Pi G\hat\psi = G\hat\psi$ if
$G\hat\psi$ is in the space, so that we can replace $\bar H$ by $H$
which yields (6).
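
The linear-space result can be checked on the smallest possible case.
In the sketch below the linear space is the whole of a two-dimensional
vector space, so that any $G$ maps the space into itself; the matrices
`H` and `G` are hypothetical. The commutator average vanishes for the
optimal vector but not for an arbitrary one.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def expect(m, v):
    """(v, m v) for a possibly complex 2x2 matrix m and vector v."""
    return sum(complex(v[i]).conjugate() * m[i][j] * v[j]
               for i in range(2) for j in range(2))

H = [[-0.5, 0.55], [0.55, 0.3]]           # hypothetical real symmetric Hamiltonian
G = [[1.0, 2 - 1j], [2 + 1j, -0.5]]       # hypothetical complex Hermitian operator

# Lowest eigenvalue and eigenvector of H (closed form for a 2x2 matrix).
a, b, d = H[0][0], H[0][1], H[1][1]
e_opt = 0.5 * (a + d) - math.sqrt(0.25 * (a - d) ** 2 + b * b)
psi_opt = [b, e_opt - a]

HG, GH = matmul(H, G), matmul(G, H)
commutator = [[HG[i][j] - GH[i][j] for j in range(2)] for i in range(2)]

hv_opt = expect(commutator, psi_opt)       # (psi, [H, G] psi) for the optimal psi
hv_other = expect(commutator, [1.0, 0.0])  # same average for a non-optimal vector

print(hv_opt, hv_other)
```

A complex $G$ is used deliberately: for real symmetric `H` and `G` and a
real vector the average of the commutator vanishes identically, so the
check would be empty.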

We have now seen two distinct consequences of the invariance of a
set of trial functions to unitary transformations: (i) invariance of
the optimal energies, and (ii) hypervirial theorems. We now want to
bring these two results together with the help of the generalized
Hellmann-Feynman theorem. Introducing a real parameter $\alpha$ we
define

$$H(\alpha) \equiv e^{-i\alpha G}\, H\, e^{i\alpha G}$$

with $\hat\psi(\alpha)$ and $\hat E(\alpha)$ the corresponding optimal
trial functions and energies for $H(\alpha)$. Now suppose that the set
of trial functions is invariant to transformation by $e^{i\alpha G}$
for all $\alpha$. Then from the results of Sec. XII-B it follows that
the $\hat E$ are in fact independent of $\alpha$. Further since
changing $\alpha$ to $\alpha'$ simply replaces one transformation by
another we see that the set of trial functions is invariant to changes
in $\alpha$. Therefore from Sec. XII-D and the result just found, that
$\hat E$ doesn't depend on $\alpha$, we have that

$$\left(\hat\psi(\alpha), \frac{\partial H(\alpha)}{\partial\alpha}\hat\psi(\alpha)\right) = 0$$

However

$$\frac{\partial H(\alpha)}{\partial\alpha} = i[H(\alpha), G]$$

Thus, putting $\alpha$ equal to zero, we have rederived the result
that $\hat\psi$ satisfies the hypervirial theorem for $G$.

We will now consider some specific hypervirial theorems of physical
interest. For H we will always use the non-relativistic fixed nucleus
electrostatic (i.e. no magnetic fields) Hamiltonian. Also we will work
almost entirely in the coordinate representation.


XVIII. MOMENTUM THEOREMS 


Let $G$ be a component of

$$\vec R \equiv \sum_s \vec r_s \tag{XVIII-1}$$

Then one readily finds that $i[H, G]$ is the corresponding component
of $\vec P$, the operator for total electronic momentum

$$\vec P = \sum_s \vec p_s \tag{XVIII-2}$$

Thus if the hypervirial theorem for this $G$ is satisfied then the
average of the corresponding component of $\vec P$ calculated using
$\hat\psi$ will vanish. We will call this a momentum theorem.

Since the components of $\vec R$ are spinless one electron operators
we then know that the $\hat\psi$ of UHF and of OHF will satisfy all
momentum theorems. One way to produce a set which is invariant to the
transformations generated by $\vec R$ is to explicitly introduce real
variational parameters $\vec K$ according to

$$\tilde\psi = e^{i\vec K\cdot\vec R}\,\phi \tag{XVIII-3}$$

where the $\phi$ are independent of $\vec K$. That this works then
follows from the observation that such a set of trial functions will be
invariant to the unitary transformation $e^{i\vec\alpha\cdot\vec R}$
since

$$e^{i\vec\alpha\cdot\vec R}\, e^{i\vec K\cdot\vec R} = e^{i(\vec K + \vec\alpha)\cdot\vec R}$$
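
The way in which the parameters $\vec K$ enforce the momentum theorem
can be made explicit. The following is a sketch under stated
assumptions, not a derivation quoted from the text: we assume units
with $\hbar = m = 1$, a Hamiltonian $H = \tfrac12\sum_s p_s^2 + V$ with
$V$ a function of the coordinates only, $N$ electrons, and a normalized
$\phi$. Since $\vec p_s\, e^{i\vec K\cdot\vec R}\phi = e^{i\vec K\cdot\vec R}(\vec p_s + \vec K)\phi$,

$$\begin{aligned}
\tilde E(\vec K) &= (\phi, H\phi) + \vec K\cdot(\phi, \vec P\phi) + \tfrac12 N K^2, \\
(\tilde\psi, \vec P\tilde\psi) &= (\phi, \vec P\phi) + N\vec K, \\
\frac{\partial\tilde E}{\partial\vec K} &= (\phi, \vec P\phi) + N\vec K = (\tilde\psi, \vec P\tilde\psi).
\end{aligned}$$

Thus making $\tilde E$ stationary with respect to $\vec K$ is the same
as requiring that the average total momentum vanish.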


However having made these remarks it is very important to point
out that often the momentum theorems will be satisfied simply for
reasons of symmetry of one kind or another. For example since $\vec P$
is a pure imaginary Hermitian operator its average will automatically
vanish if $\hat\psi$ is real. Proof:

$$(\hat\psi, \vec P\hat\psi) = (\hat\psi, \vec P\hat\psi)^* = (\hat\psi^*, \vec P^*\hat\psi^*) = -(\hat\psi, \vec P\hat\psi)$$

therefore $(\hat\psi, \vec P\hat\psi) = 0$. More generally if H is
real the hypervirial theorem for any real Hermitian $G$ will be
satisfied if $\hat\psi$ is real since $i[H, G]$ is then a pure
imaginary Hermitian operator.

Another example is provided by an isolated atom. Then, reality
aside, if, as is generally the case, $\hat\psi$ has a definite parity
under inversion through the nucleus then $\hat\psi^*\hat\psi$ is
invariant to inversion while $\vec P$ obviously changes sign. Therefore
$(\hat\psi, \vec P\hat\psi)$ will vanish. Or consider a diatomic
molecule (or an atom in a field which is invariant to rotation about an
axis through the nucleus). If $\hat\psi$ has a definite component of
angular momentum along the internuclear axis, then
$\hat\psi^*\hat\psi$ will be invariant to a rotation of 180° about the
internuclear axis while the components of $\vec P$ perpendicular to the
axis will change sign. Therefore averages of these components will
automatically vanish. Also they will vanish if the molecule, instead
of having a definite component of angular momentum, has a definite
parity for reflection in any plane containing the internuclear axis.
The vanishing of the component along the internuclear axis is then
guaranteed (by an adaptation of the earlier reality argument) if, as is
often the case, $\hat\psi$ is complex only because it contains an
angular factor on which the component of $\vec P$ along the axis does
not act. Also for a homonuclear diatomic molecule one will usually
arrange for the $\hat\psi$ to have a definite parity with respect to
reflection in a plane perpendicular to the axis and through the
mid-point and this will also ensure the vanishing of the average
momentum along the internuclear axis.
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XIX. FORCE THEOREMS 


Let $G$ be a component of the operator for total electronic
momentum $\vec P$. Then one readily finds that $i[H, G]$ is the
corresponding component of $\vec F$, the operator for the total
force on the electrons. $\vec F$ is of course the sum of the forces
due to the nuclei and the forces due to whatever external fields may
be present, the electron-electron forces cancelling. Thus for a
molecule in the absence of external fields

$$\vec F = -\sum_s \sum_A Z_A \frac{\vec r_s - \vec R_A}{|\vec r_s - \vec R_A|^3} \tag{XIX-1}$$

where $Z_A$ is the charge on the A'th nucleus and $\vec R_A$ is its
position vector. (Incidentally the reader should keep in mind that this
result and various others which we have been and are stating in the
language of atoms and molecules and solids, are of course either quite
general and do not depend on the detailed nature of the force laws or
can be easily generalized). Thus if the hypervirial theorem for this
$G$ is satisfied then the average of the corresponding component
of $\vec F$ calculated using $\hat\psi$ will vanish. This we will call
a force theorem.

Since the components of $\vec P$ are spinless one electron operators we
then know that the $\hat\psi$ of UHF and of OHF satisfy all force
theorems. Further since a component of $\vec P$ generates a rigid
translation of the electrons in the corresponding direction we know
that in general if the set of trial functions is invariant to rigid
translation in a particular direction, then the corresponding force
theorem will be satisfied by the $\hat\psi$.

One way to produce a set of trial functions which is invariant to
all translations is to explicitly introduce real variational parameters
attached in an additive way to each electron coordinate:

$$\tilde\psi = \phi(\vec r_1 + \vec a, \vec r_2 + \vec a, \ldots, \vec r_N + \vec a) \tag{XIX-2}$$

$\phi$ may of course involve other variation parameters and/or
functions and may also depend on non variational parameters like
nuclear coordinates, charges, etc. However here, and in analogous
situations later, we will not indicate such dependencies explicitly
unless they are relevant to the discussion (we have followed the same
policy all along with respect to spin coordinates). That this works
then follows from the observation that a rigid translation
$\vec r_s \to \vec r_s + \vec b$ is equivalent to
$\vec a \to \vec a + \vec b$, i.e. simply produces another member of
the set. Indeed (2) formally has the same structure as (XVIII-3) since
evidently we can write (2) as

$$\tilde\psi = e^{i\vec a\cdot\vec P}\,\phi(\vec r_1, \vec r_2, \ldots, \vec r_N) \tag{XIX-3}$$

Also (XVIII-3) transformed to momentum space takes the form

$$\chi(\vec p_1 - \vec K, \vec p_2 - \vec K, \ldots, \vec p_N - \vec K)$$

where $\chi$ is the Fourier transform of $\phi$, i.e. $\vec R$
generates translations in momentum space.
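
A one-dimensional, one-particle sketch may make the role of the
translation parameter concrete. Everything in it is a hypothetical
model chosen for illustration: the potential $V(x) = x^4/4 + Cx$, the
Gaussian $\phi$ of fixed and deliberately non-optimal width `S`, and
the function names. Since the kinetic energy of $\phi(x + a)$ does not
depend on $a$, only the average potential energy needs to be minimized
with respect to $a$; the average force is then evaluated separately.

```python
import math

C = 0.8          # hypothetical linear term in the potential
S = 0.6          # fixed, deliberately non-optimal, width of the density

def v(x):
    return 0.25 * x ** 4 + C * x

def force(x):
    return -(x ** 3 + C)      # F = -dV/dx

def average(func, a, n=2001, span=8.0):
    """Average of func over |phi(x + a)|^2, a Gaussian density centred at x = -a."""
    h = 2 * span * S / (n - 1)
    total = weight = 0.0
    for k in range(n):
        t = -span * S + k * h                 # t = x + a
        w = math.exp(-t * t / (2 * S * S))
        total += w * func(t - a)
        weight += w
    return total / weight

def minimise(f, lo, hi, tol=1e-10):
    """Golden-section search for the minimum of a unimodal function."""
    g = (math.sqrt(5) - 1) / 2
    x1, x2 = hi - g * (hi - lo), lo + g * (hi - lo)
    f1, f2 = f(x1), f(x2)
    while hi - lo > tol:
        if f1 < f2:
            hi, x2, f2 = x2, x1, f1
            x1 = hi - g * (hi - lo)
            f1 = f(x1)
        else:
            lo, x1, f1 = x1, x2, f2
            x2 = lo + g * (hi - lo)
            f2 = f(x2)
    return 0.5 * (lo + hi)

a_opt = minimise(lambda a: average(v, a), -3.0, 3.0)
force_opt = average(force, a_opt)     # average force with the optimal shift
force_fixed = average(force, 0.0)     # average force with the shift frozen at zero

print(a_opt, force_opt, force_fixed)
```

With the shift optimized the average force vanishes to within the
accuracy of the search, even though the width of the trial function is
not optimal; with the shift frozen it does not.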

However, just as with $\vec P$, this elaborate machinery may be
unnecessary in that force theorems can often be satisfied simply by
reason of symmetry. Thus consider again an isolated atom. Since
$\vec F$ is odd under inversion through the nucleus,
$(\hat\psi, \vec F\hat\psi)$ will vanish if $\hat\psi$ has a definite
parity under inversion through the nucleus.
Also for a diatomic molecule (or an atom in an external field that is
axially symmetric) force theorems perpendicular to the internuclear
axis can be satisfied under the same conditions as the corresponding
momentum theorems of the preceding section. Also this same symmetry
will ensure that the average net force on each nucleus separately will
have only an axial component. However along the axis, symmetry is
usually of no help except in the case of a homonuclear diatomic
molecule. Hence to satisfy the force theorem along the axis usually
requires the use of a set of trial functions which is explicitly
invariant to translations along the axis.

As an interesting application of the force theorems, consider a
molecule, first in an external electric field. The net force on all
the nuclei is then the sum of the forces on the nuclei due to the
electrons, and the forces on the nuclei due to the external field, the
nucleus-nucleus forces cancelling. Thus

Net force on nuclei = Force on nuclei due to electrons +
                      Force on nuclei due to external field      (XIX-4)

On the other hand the net force on the electrons is the sum of the
forces on the electrons due to the nuclei plus the forces on the
electrons due to the external field, and if the force theorems are
satisfied these two contributions cancel. Thus

0 = Force on electrons due to nuclei +
    Force on electrons due to external field                     (XIX-5)

However

Force on electrons due to nuclei = - Force on nuclei
                                     due to electrons            (XIX-6)

Therefore combining (4), (5), and (6) we have the result

Net force on nuclei = Force on nuclei due to external field +
                      Force on electrons due to external field   (XIX-7)

from which we can draw several interesting conclusions.

(i) Suppose that the molecule contains only one nucleus, i.e. is
an atom. Then if we are dealing with a uniform external electric field
$\vec{\mathcal E}$ the force on the nucleus due to the external field
is simply $Z\vec{\mathcal E}$ while the force on the electrons due to
the field is $-N\vec{\mathcal E}$. Therefore if the force theorems are
satisfied so that (7) applies we have the $\hat\psi$ independent result

$$\text{Net force on nucleus} = (Z - N)\vec{\mathcal E} = Z\vec{\mathcal E}\left(1 - \frac{N}{Z}\right) \tag{XIX-8}$$

N/Z is called the dipole shielding factor and evidently N/Z is its
exact value.

(ii) If for an isolated atom the force theorems are satisfied
then the force on the nucleus calculated from $\hat\psi$ will vanish.

(iii) Returning to molecules, if there is no external field,
then if the force theorems are satisfied so that (7) applies, we have
that

$$\text{Net force on nuclei} = 0 \tag{XIX-9}$$

In particular then, in a diatomic molecule the force on one nucleus
calculated from $\hat\psi$ will be equal and opposite to the force on
the other.


XX. TORQUE THEOREMS

Let $G$ be a component of $\vec L$, the operator for the total
electronic orbital angular momentum

$$\vec L = \sum_s \vec r_s \times \vec p_s$$

Then one readily finds that $i[H, G]$ is the corresponding component
of the operator for the net torque on the electrons, this net
torque being provided by the nuclei and whatever external fields may
be present, the electron-electron contributions cancelling. Thus if
the hypervirial theorem for this $G$ is satisfied then the average
of the corresponding component of the net torque calculated using
$\hat\psi$ will vanish. This we will call a torque theorem.

Since the components of $\vec L$ are spinless one-electron operators
we then know that the $\hat\psi$ of UHF and of OHF satisfy all torque
theorems. Further since a component of $\vec L$ generates a rigid
rotation of the electrons about the corresponding axis we know that in
general if the set of trial functions is invariant to rigid rotations
about a particular axis then the corresponding torque theorem will be
satisfied by the $\hat\psi$.

The angular momentum and torque which we have been talking about
are calculated about the origin of the coordinate system. If more
generally we calculate the angular momentum and torque about another
point we will usually get a different answer. That is if we replace the
$\vec r_s$ by $\vec r_s - \vec a$ then the average angular momentum
changes by

$$-\vec a \times \frac{(\hat\psi, \vec P\hat\psi)}{(\hat\psi, \hat\psi)}$$

while the average torque changes by

$$-\vec a \times \frac{(\hat\psi, \vec F\hat\psi)}{(\hat\psi, \hat\psi)}$$

Therefore only if the momentum theorems are satisfied is the average
angular momentum independent of origin, and only if the force theorems
are satisfied are the average torques independent of origin.

We now note that for an isolated atom the operator for the torque
about the nucleus vanishes identically ($\vec L$ about the nucleus is
conserved) so that if the force theorems are satisfied all torque
theorems about an arbitrary origin will be satisfied. Also for a
diatomic molecule the torque theorem for torques about one of the
nuclei will usually be satisfied simply by symmetry, and therefore if
the force theorems are satisfied, the torque theorems about an
arbitrary axis will be satisfied. Proof: About nucleus 1 the net torque
on the electrons is due only to the other nucleus, the torque operator
being proportional to

$$\vec{\mathcal T} = \sum_s (\vec r_s - \vec R_1) \times \frac{\vec r_s - \vec R_2}{|\vec r_s - \vec R_2|^3} = (\vec R_2 - \vec R_1) \times \vec W, \qquad \vec W \equiv \sum_s \frac{\vec r_s - \vec R_2}{|\vec r_s - \vec R_2|^3}$$

Now the components of $\vec W$ which are perpendicular to
$\vec R_1 - \vec R_2$ change sign if we rotate by 180° about the
internuclear axis. Therefore if $\hat\psi$ has a definite component of
angular momentum along the internuclear axis (or a definite parity for
reflection in any plane containing the internuclear axis) it follows
that the average of these components of $\vec W$ will vanish. Therefore
we may effectively replace $\vec W$ by its component along
$\vec R_1 - \vec R_2$ whence $\vec{\mathcal T}$ is effectively zero,
which completes the proof.
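
The statement made earlier in this section about the dependence of the
average angular momentum and torque on the choice of origin follows
from a one-line calculation. As a sketch, writing $\vec f_s$ for the
force operator on electron $s$ (a symbol introduced here only for this
purpose, with $\vec F = \sum_s \vec f_s$), and $\vec N$ for the torque
operator about the original origin:

$$\begin{aligned}
\sum_s (\vec r_s - \vec a) \times \vec p_s &= \vec L - \vec a \times \vec P, \\
\sum_s (\vec r_s - \vec a) \times \vec f_s &= \vec N - \vec a \times \vec F.
\end{aligned}$$

Taking averages with $\hat\psi$ then shows that the origin dependent
terms vanish exactly when the momentum theorems and the force theorems,
respectively, are satisfied.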


XXI. VIRIAL THEOREMS 


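Let $G$ be the operator

$$G = \sum_s \vec r_s \cdot \vec p_s - \frac{3i}{2}N \tag{XXI-1}$$

the addition of $-3iN/2$ ensuring that $G$ is Hermitian. However it
clearly plays no role in the hypervirial theorem for $G$. Then one
finds that $e^{i\alpha G}\psi$ equals

$$e^{3N\alpha/2}\,\psi(e^{\alpha}\vec r_1, e^{\alpha}\vec r_2, \ldots, e^{\alpha}\vec r_N) \tag{XXI-2}$$

Proof: $\Phi \equiv e^{i\alpha G}\psi$ can also be defined by

$$\frac{\partial\Phi}{\partial\alpha} = iG\Phi \tag{XXI-3}$$

where $\Phi(\alpha = 0) = \psi$. It is then easy to verify that (2)
satisfies (3).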
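
The verification can be sketched as follows, assuming
$\vec p_s = -i\nabla_s$ (units with $\hbar = 1$), so that the scaling
operator considered here satisfies
$iG = \sum_s \vec r_s\cdot\nabla_s + 3N/2$, and writing
$\Phi(\alpha) \equiv e^{3N\alpha/2}\,\psi(e^{\alpha}\vec r_1, \ldots, e^{\alpha}\vec r_N)$
for the scaled function:

$$\begin{aligned}
\frac{\partial\Phi}{\partial\alpha}
&= \frac{3N}{2}\Phi + e^{3N\alpha/2}\sum_s e^{\alpha}\vec r_s \cdot (\nabla_s\psi)(e^{\alpha}\vec r_1, \ldots, e^{\alpha}\vec r_N) \\
&= \left(\frac{3N}{2} + \sum_s \vec r_s\cdot\nabla_s\right)\Phi = iG\Phi, \\
\int |\Phi|^2\, d^{3N}r &= e^{3N\alpha}\int |\psi(e^{\alpha}\vec r_1, \ldots)|^2\, d^{3N}r = \int |\psi|^2\, d^{3N}r .
\end{aligned}$$

The second line uses the chain rule in the form
$\vec r_s\cdot\nabla_s[\psi(e^{\alpha}\vec r)] = e^{\alpha}\vec r_s\cdot(\nabla_s\psi)(e^{\alpha}\vec r)$,
and the last line shows that the normalization is preserved.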

Thus $e^{i\alpha G}$ produces a positive scaling of the electronic
coordinates, the factor of $e^{3N\alpha/2}$ which arises from the
$-3iN/2$ ensuring that the normalization of the scaled function is the
same as that of $\psi$, as befits a unitary transformation. Since
however this factor will cancel out in calculating energies we
therefore have the result that if the set of trial functions is
effectively invariant to such scaling then the $\hat\psi$ will satisfy
the hypervirial theorem for $G$ (we will discuss the content of this
theorem in a moment). In particular since $G$ is a spinless
one-electron operator the $\hat\psi$ of UHF and of OHF will satisfy the
theorem. One common way to ensure that a set of trial functions will be
invariant to positive scaling is to explicitly include a coordinate
scaling parameter as a variational parameter. That is one uses trial
functions of the form

$$\tilde\psi = s^{3N/2}\,\phi(s\vec r_1, s\vec r_2, \ldots, s\vec r_N) \tag{XXI-4}$$

with $s$ a real variational parameter, the factor of $s^{3N/2}$ being
optional. That such a set is invariant to positive scaling then follows
from the observation that replacing the $\vec r_s$ by
$e^{\alpha}\vec r_s$ in $\tilde\psi$ is equivalent to leaving the
$\vec r_s$ alone and replacing $s$ by $e^{\alpha}s$ and
therefore effectively produces another member of the set. Moreover such
a set is obviously invariant not only to positive scaling but also
negative scaling and hence in particular to inversion.
Evidently UHF and OHF also have this property. In general when a set
is invariant to both positive and negative scaling we will say simply
that it is invariant to scaling.

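The hypervirial theorem for $G$ is essentially the virial theorem
which is so often used in discussions of chemical binding, force
constants, etc. To show this we first note that we can calculate
$i[H, G]$ quite generally as

$$i[H, G] = \sum_s \left(\vec p_s \cdot \frac{\partial H}{\partial\vec p_s} - \vec r_s \cdot \frac{\partial H}{\partial\vec r_s}\right) \tag{XXI-5}$$

Now let us specialize to an isolated molecule. Then

$$H = T + V \tag{XXI-6}$$

where T, the kinetic energy operator, is a homogeneous function of
degree 2 in the $\vec p_s$, while V, the potential energy operator,
which we will take to include the nuclear repulsions so that $\hat E$
is the total molecular energy, is a homogeneous function of degree (-1)
in the $\vec r_s$ and the $\vec R_A$. Thus from Euler's theorem on
homogeneous functions we find

$$\sum_s \vec p_s \cdot \frac{\partial T}{\partial\vec p_s} = 2T \tag{XXI-7}$$

and

$$\sum_s \vec r_s \cdot \frac{\partial V}{\partial\vec r_s} = -V - \sum_A \vec R_A \cdot \frac{\partial V}{\partial\vec R_A} \tag{XXI-8}$$

Therefore the hypervirial theorem for $G$ becomes

$$2\hat T + \hat V - \sum_A \vec R_A \cdot \hat{\vec F}_A = 0 \tag{XXI-9}$$

where $\hat T$ is the average kinetic energy of the electrons,
$\hat V$ the average potential energy of the electrons and nuclei, and
$\hat{\vec F}_A$ is the average force on nucleus A:

$$\hat{\vec F}_A = \frac{(\hat\psi, \vec F_A\hat\psi)}{(\hat\psi, \hat\psi)} \tag{XXI-10}$$

where

$$\vec F_A = -\frac{\partial V}{\partial\vec R_A} \tag{XXI-11}$$

We will call (9) the generalized virial theorem. To reduce it to more
familiar form suppose that $\hat\psi$, in addition to satisfying the
hypervirial theorem for $G$, also satisfies the generalized
Hellmann-Feynman theorems in Cartesian coordinates for $\sigma$ equal
to the components of the $\vec R_A$. That is suppose that

$$\hat{\vec F}_A = -\frac{\partial\hat E}{\partial\vec R_A} \tag{XXI-12}$$

Then (9) yields

$$2\hat T + \hat V + \sum_A \vec R_A \cdot \frac{\partial\hat E}{\partial\vec R_A} = 0 \tag{XXI-13}$$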

Finally let us suppose that $\hat\psi$ was derived from a set of trial
functions which is invariant to translations and rotations of the
electrons. Then since a translation or rotation of the electrons is
equivalent, as far as $H$ is concerned, to leaving the electrons alone
and translating and rotating the nuclei the other way, and since our
set is already assumed to be independent of the $\mathbf{R}_A$ in order to
satisfy the generalized Hellmann-Feynman theorems for $\mathbf{R}_A$, it follows
from Sec. XIV that $\hat{E}$ can depend only on translationally and
rotationally invariant quantities like bond lengths
$R_{AB} = |\mathbf{R}_A - \mathbf{R}_B|$ and bond angles. Since the latter are
homogeneous functions of degree zero in the $\mathbf{R}_A$ and since

$$\sum_C \mathbf{R}_C\cdot\frac{\partial R_{AB}}{\partial\mathbf{R}_C} = R_{AB}$$

one then readily finds that under these conditions (13) can be
written$^{13}$ as

$$2\hat{T} + \hat{V} + \sum R_{AB}\,\frac{\partial\hat{E}}{\partial R_{AB}} = 0 \tag{XXI-14}$$

where the sum runs over the bond lengths. For a diatomic molecule this
becomes the familiar

$$2\hat{T} + \hat{V} + R\,\frac{d\hat{E}}{dR} = 0 \tag{XXI-15}$$

where $R$ is the internuclear separation, and for an isolated atom
($R = 0$) is the equally familiar

$$2\hat{T} + \hat{V} = 0 \tag{XXI-16}$$

Eq. (14), or its specializations (15) and (16), is what is usually
called the virial theorem.

Thus if one uses a set of trial functions which is invariant to
positive scaling of the electronic coordinates, is invariant to changes
in the nuclear coordinates, and is invariant to translations and
rotations of the $\mathbf{r}_i$, then the $\hat\psi$ will satisfy (9), (13), and
(14). In addition we know from previous sections that these conditions
will also guarantee that the net force on the electrons and the net
torque on the electrons calculated from $\hat\psi$ will both vanish, and
that therefore the sum of the forces on all the nuclei as calculated
from $\hat\psi$ will vanish. Also the forces on the nuclei in a diatomic
molecule will be only along the axis since if $\hat{E}$ is a function
only of $R = |\mathbf{R}_1 - \mathbf{R}_2|$ then

$$\frac{\partial\hat{E}}{\partial\mathbf{R}_1} = \frac{d\hat{E}}{dR}\,\frac{\mathbf{R}_1 - \mathbf{R}_2}{R} = -\frac{\partial\hat{E}}{\partial\mathbf{R}_2} \tag{XXI-17}$$

This then will also mean that the average net torque on the nuclei will
vanish.

As we have seen, UHF and OHF satisfy all these conditions. For
other sorts of trial functions one often explicitly introduces variation
parameters to do the job. Since there are various ways of doing this
it will help to avoid notational confusion to consider a simple example:
a one-electron diatomic molecule in a simple LCAO type approximation
involving two 1s atomic orbitals. Generalizations should be obvious.

A first choice (Heitler-London) for the set of $\tilde\psi$ might be the
functions

$$\tilde\psi = e^{-|\mathbf{r} - \mathbf{R}_1|} + A\,e^{-|\mathbf{r} - \mathbf{R}_2|} \tag{XXI-18}$$

where $A$ is a variational parameter. However this set has none of
the properties we want. It is not invariant to scaling, it depends
explicitly on $\mathbf{R}_1$ and $\mathbf{R}_2$, and it is not invariant to either rotation
or translation of the electron's coordinates. To take care of all these
deficiencies, but still keep the same sort of $\tilde\psi$, it is then quite
natural to introduce two real positive variational parameters $\zeta_1$
and $\zeta_2$ and two real vector variational parameters $\mathbf{c}_1$ and $\mathbf{c}_2$
and use the set of trial functions

$$\tilde\psi = e^{-\zeta_1|\mathbf{r} - \mathbf{c}_1|} + A\,e^{-\zeta_2|\mathbf{r} - \mathbf{c}_2|} \tag{XXI-19}$$

This set is then obviously invariant to changes in $\mathbf{R}_1$ and $\mathbf{R}_2$ since
each member of the set is separately invariant. Also it is obviously
invariant to scaling. Further the set is invariant to translations
since replacing $\mathbf{r}$ by $\mathbf{r} + \mathbf{t}$ is equivalent to replacing $\mathbf{c}_1$
and $\mathbf{c}_2$ by $\mathbf{c}_1 - \mathbf{t}$ and $\mathbf{c}_2 - \mathbf{t}$. Finally the set is invariant
to rotation since replacing $\mathbf{r}$ by $\mathcal{A}\cdot\mathbf{r}$, where $\mathcal{A}$ is a rotation
dyadic, is equivalent to replacing $\mathbf{c}_1$ and $\mathbf{c}_2$ by $\mathcal{A}^{-1}\cdot\mathbf{c}_1$ and
$\mathcal{A}^{-1}\cdot\mathbf{c}_2$. Proof: $|\mathcal{A}\cdot\mathbf{r} - \mathbf{c}| = |\mathcal{A}\cdot(\mathbf{r} - \mathcal{A}^{-1}\cdot\mathbf{c})| = |\mathbf{r} - \mathcal{A}^{-1}\cdot\mathbf{c}|$
since the length of a vector, in this case $\mathbf{r} - \mathcal{A}^{-1}\cdot\mathbf{c}$,
is invariant to rotation. (As we said, we wrote down (19) by analogy
with (18). However it should be noted that either term in the sum (19)
would yield a set with the same invariance properties.)

Thus far our discussion has been quite general, with a view toward
application to a general polyatomic molecule. However for atoms and
diatomic molecules, for example, one has special symmetries which are
usually taken advantage of in any variational calculation. First let

us consider an atom. Then (9) becomes

$$2\hat{T} + \hat{V} - \mathbf{R}\cdot\hat{\mathbf{F}} = 0 \tag{XXI-20}$$

which is equivalent to the virial theorem if the force theorems are
satisfied, and we have seen that this will usually be the case for
reasons of symmetry. However we can also derive the virial theorem from
(20) simply by using the nucleus as the origin of coordinates so that
$\mathbf{R} = 0$; and this one would almost certainly do in any practical
calculation. Thus if one chooses the nucleus as the coordinate origin
then the virial theorem will be guaranteed simply by having the set of
trial functions be invariant to positive scaling of electronic coordinates.
Further, with this origin of coordinates we see that most restricted
Hartree-Fock methods for atoms will also satisfy the virial theorem
since positive scaling does not affect angles, i.e. only radial
coordinates need be scaled.

Also for atoms, and with the origin of coordinates at the nucleus,
the following alternative derivation of the virial theorem as a
consequence of invariance to scaling is of interest.$^{2}$ Suppose that we
calculate $\tilde{E}$ using the trial function (4). Then by changing
variables in the integrals from the $\mathbf{r}_i$ to the $s\mathbf{r}_i$ and using
the homogeneity properties of $T$ and $V$ it is easy to show that, in
obvious notation,

$$\tilde{E}(s) = s^2\,\tilde{T}(1) + s\,\tilde{V}(1) \tag{XXI-21}$$

where to be definite we have assumed that $s$ is positive. Requiring
that $\partial\tilde{E}/\partial s = 0$ then yields

$$2\hat{s}\,\tilde{T}(1) + \tilde{V}(1) = 0$$

or, multiplying through by $\hat{s}$,

$$2\hat{T} + \hat{V} = 0$$

which is the virial theorem again.
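The scaling argument around Eq. (21) can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes the hydrogen atom in atomic units and two unscaled trial functions, $e^{-r}$ and $e^{-r^2}$, whose unscaled kinetic and potential averages are quoted as standard values rather than derived here. The function name `optimal_scale` is hypothetical.

```python
import math

def optimal_scale(t1, v1):
    """Minimize E(s) = s**2 * t1 + s * v1 over the scale parameter s.

    t1 and v1 are the unscaled averages T(1) and V(1) of Eq. (XXI-21).
    Returns (s_hat, T_hat, V_hat, E_hat).
    """
    s_hat = -v1 / (2.0 * t1)          # root of dE/ds = 2*s*t1 + v1 = 0
    t_hat = s_hat ** 2 * t1           # scaled kinetic energy
    v_hat = s_hat * v1                # scaled potential energy
    return s_hat, t_hat, v_hat, t_hat + v_hat

# Hydrogen atom, atomic units. The (T(1), V(1)) pairs are assumed inputs.
cases = {
    "exponential exp(-r)": (0.5, -1.0),
    "gaussian exp(-r**2)": (1.5, -2.0 * math.sqrt(2.0 / math.pi)),
}

results = {name: optimal_scale(t1, v1) for name, (t1, v1) in cases.items()}
for name, (s_hat, t_hat, v_hat, e_hat) in results.items():
    print(f"{name}: s = {s_hat:.4f}, E = {e_hat:.4f}, "
          f"2T + V = {2 * t_hat + v_hat:.1e}")
```

Both trial functions satisfy $2\hat{T} + \hat{V} = 0$ after optimization of $s$, although only the exponential one reproduces the exact energy; the virial theorem is a consequence of the scaling freedom, not of the quality of the trial function.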

Turning now to diatomic molecules, here one would almost certainly
put the origin of coordinates on the internuclear axis and use the
internuclear axis as one of the coordinate axes, say the $x'$ axis. Under
these circumstances $V$ will be a homogeneous function of degree $-1$
in the $\mathbf{r}'_i$ and the $X'_A$ so that if we use a set of $\tilde\psi$ which is
invariant to positive scaling of the $\mathbf{r}'_i$ we will find, instead of
(9), that $\hat\psi$ satisfies

$$2\hat{T} + \hat{V} - \sum_A X'_A\,\hat{F}'_{Ax} = 0 \tag{XXI-22}$$

If further the $\hat\psi$ satisfy the generalized Hellmann-Feynman theorem
for $X'_A$ in the $\mathbf{r}'_i$ coordinates:

$$\frac{\partial\hat{E}}{\partial X'_A} = -\hat{F}'_{Ax} \tag{XXI-23}$$

then (22) yields

$$2\hat{T} + \hat{V} + \sum_A X'_A\,\frac{\partial\hat{E}}{\partial X'_A} = 0 \tag{XXI-24}$$

Eq. (24) will then be equivalent to the virial theorem if $\hat{E}$ depends
only on the internuclear separation $R = |X'_1 - X'_2|$. Since translation
of the electrons along the $x'$ axis and inversion of the electrons
(strictly we need consider only $x'_i \to -x'_i$) is, as far as $H$ is
concerned, equivalent to leaving the electrons alone and translating and
inverting the nuclei, it follows from Sec. XIV that if the set of trial
functions, in addition to being invariant to changes in the $X'_A$, is
invariant to such translations and inversions of the electrons then
$\hat{E}$ will depend only on the translation and inversion invariant
quantity $R$.

Thus if the set of $\tilde\psi$ is invariant to scaling of the $\mathbf{r}'_i$,
is invariant to changes in the $X'_A$, and is invariant to translation
of the $\mathbf{r}'_i$ along the internuclear axis then the $\hat\psi$ will satisfy
(22), (23), and (24) and (15). Also the force theorem will be satisfied
along the internuclear axis. UHF and OHF, of course, have all these
properties. Returning to our one-electron diatomic molecule, a set of
trial functions with these properties would be

$$\tilde\psi = e^{-\zeta_1|\mathbf{r}' - c_1\hat{\mathbf{x}}'|} + A\,e^{-\zeta_2|\mathbf{r}' - c_2\hat{\mathbf{x}}'|} \tag{XXI-25}$$

and again, as with (19), either term in the sum yields a set with the
same invariance properties.

If (22) is satisfied one may ask whether or not the more general
(9) is also satisfied. The Cartesian coordinates in (9) are arbitrary,
differing therefore from the $x'y'z'$ coordinates by a rotation about
the origin plus some translation. Now, since the nuclei lie on the $x'$
axis, $\sum_A X'_A\hat{F}'_{Ax}$ is just $\sum_A \mathbf{R}'_A\cdot\hat{\mathbf{F}}'_A$ which, if we were
to delete the primes, would be precisely the sum appearing in (9).
It therefore readily follows that if we write
$\mathbf{R}_A = \mathcal{A}\cdot\mathbf{R}'_A + \mathbf{d}$, where $\mathcal{A}$ is a rotation dyadic and $\mathbf{d}$
a translation, then

$$\sum_A \mathbf{R}_A\cdot\hat{\mathbf{F}}_A - \sum_A X'_A\,\hat{F}'_{Ax} = \mathbf{d}\cdot\sum_A \hat{\mathbf{F}}_A$$

Therefore we can write (22) referred to general coordinates (but with
the same $\hat\psi$) as

$$2\hat{T} + \hat{V} - \sum_A \mathbf{R}_A\cdot\hat{\mathbf{F}}_A + \mathbf{d}\cdot\sum_A \hat{\mathbf{F}}_A = 0 \tag{XXI-26}$$

where $\mathbf{d}$ is the vector connecting the origin of the $x'y'z'$
coordinates to the origin of the $xyz$ coordinates. Thus if all force
theorems are satisfied so that $\sum_A \hat{\mathbf{F}}_A = 0$, (22) will imply (9). As
we have discussed before, the force theorems perpendicular to the
internuclear axis will usually be satisfied by symmetry, and for example
such is the case with the functions (25) since they are individually
invariant to rotation about the internuclear axis and therefore have zero
angular momentum about the internuclear axis. Therefore if $\mathbf{d}$ is
perpendicular to the internuclear axis, (22) and (9) can be made
equivalent simply by symmetry. If $\mathbf{d}$ has a component along the
internuclear axis then one also needs to explicitly satisfy the force
theorem along the axis in order to have equivalence.

Returning to the $x'y'z'$ coordinates, often they are chosen in
such a way that whatever the nuclear separation, the coordinate origin
is a fixed fractional distance along the internuclear axis, thus at a
point

$$\alpha\,\mathbf{R}_1 + (1 - \alpha)\,\mathbf{R}_2 \tag{XXI-27}$$

Introducing coordinates $\boldsymbol{\xi}_i$ referred to this origin
we find that the nuclei are located at

$$-(1 - \alpha)\,\boldsymbol{\mathcal{R}} \quad\text{and}\quad \alpha\,\boldsymbol{\mathcal{R}}$$

where

$$\boldsymbol{\mathcal{R}} = \mathbf{R}_2 - \mathbf{R}_1 \tag{XXI-28}$$

is the vector separation of the nuclei.
Thus since $\boldsymbol{\mathcal{R}}$ has only one component, call it $\mathcal{R}$, in these
coordinates we see that $V$ is a homogeneous function of degree $-1$
in the $\boldsymbol{\xi}_i$ and $\mathcal{R}$, and therefore if the set of $\tilde\psi$ is
invariant to positive scaling of the $\boldsymbol{\xi}_i$ we will have

$$2\hat{T} + \hat{V} - \mathcal{R}\,\hat{F}_{\mathcal{R}} = 0 \tag{XXI-29}$$

where

$$\hat{F}_{\mathcal{R}} = -(1 - \alpha)\,(\text{average axial component of force on nucleus 1}) + \alpha\,(\text{average axial component of force on nucleus 2}) \tag{XXI-30}$$

If further the generalized Hellmann-Feynman theorem for $\mathcal{R}$ in $\boldsymbol{\xi}_i$
coordinates is satisfied,

$$\frac{\partial\hat{E}}{\partial\mathcal{R}} = -\hat{F}_{\mathcal{R}} \tag{XXI-31}$$

then we will have

$$2\hat{T} + \hat{V} + \mathcal{R}\,\frac{\partial\hat{E}}{\partial\mathcal{R}} = 0 \tag{XXI-32}$$

This equation will then be equivalent to the virial theorem if $\hat{E}$
depends only on $R = |\mathcal{R}|$. Since as far as $H$ is concerned
$\boldsymbol{\xi}_i \to -\boldsymbol{\xi}_i$ is equivalent to $\mathcal{R} \to -\mathcal{R}$ (strictly all we need
is $x_i \to -x_i$) it follows from the discussion in Sec. XIV that
if the set of trial functions (which we have assumed to be independent
of $\mathcal{R}$) is also invariant to negative scaling, $\hat{E}$ will depend
only on $R$ and so we will have the virial theorem. However these
devices alone, invariance to scaling of the $\boldsymbol{\xi}_i$ and invariance in
$\boldsymbol{\xi}_i$ coordinates to changes in $\mathcal{R}$, will not in general ensure that
either (22), (23) or the axial force theorem are satisfied. Of course
if the axial force theorem is satisfied then, from (42) and (43) and
assuming that $\mathcal{R} > 0$, we will have

$$-\frac{\partial\hat{E}}{\partial R} = \text{average axial component of force on nucleus 2} = -\,\text{average axial component of force on nucleus 1}.$$

If $\mathcal{R}$ is $< 0$ the signs on the right are reversed.
Along these same lines it is of interest to note that Löwdin
has given a simple prescription which leads directly to the virial
theorem. The prescription is to use trial functions of the form

$$\tilde\psi = \varphi(s\boldsymbol{\xi}_1, s\boldsymbol{\xi}_2, \ldots, s\boldsymbol{\xi}_N;\ s\mathcal{R}) \tag{XXI-33}$$

and the proof is most easily given following the pattern centering
around Eq. (21). Thus introducing the $s\boldsymbol{\xi}_i$ as integration variables,
one readily finds that in obvious notation

$$\tilde{E}(s, \mathcal{R}) = s^2\,\tilde{T}(1, s\mathcal{R}) + s\,\tilde{V}(1, s\mathcal{R})$$

so that requiring $\partial\tilde{E}/\partial s = 0$ yields

$$2\hat{s}\,\tilde{T}(1, \hat{s}\mathcal{R}) + \tilde{V}(1, \hat{s}\mathcal{R}) + \frac{\mathcal{R}}{\hat{s}}\Big(\frac{\partial\tilde{E}}{\partial\mathcal{R}}\Big)_{s = \hat{s}} = 0$$

Multiplying through by $\hat{s}$ then yields (32), since

$$\frac{d\hat{E}}{d\mathcal{R}} = \Big(\frac{\partial\tilde{E}}{\partial\mathcal{R}}\Big)_{s = \hat{s}}$$

However since $H$ is invariant to a simultaneous change in sign of the
$\boldsymbol{\xi}_i$ and of $\mathcal{R}$, and since our set of trial functions has the same
property (the transformation is equivalent to $s \to -s$), it follows
from Sec. XIV that the $\hat{E}$ do not depend on the sign of $\mathcal{R}$, and
so we have the virial theorem. However unless the individual $\tilde\psi$
don't depend on $\mathcal{R}$ at all, or involve $\mathcal{R}$ only in conjunction
with another variational parameter (see below), such a set is in general
not invariant to positive scaling and is not invariant to changes in
$\mathcal{R}$, since neither separately is equivalent to a change in $s$.
Therefore although $\hat\psi$ will satisfy the virial theorem it will not,
in general, satisfy either (29) or (31). Also it will not in general
satisfy the axial force theorem without further variational flexibility.

On the other hand any set of the form $\varphi(s\boldsymbol{\xi}_i;\ \gamma\mathcal{R})$, with
$\gamma$ a second variational parameter, since it
is separately invariant to scaling of the $\boldsymbol{\xi}_i$ (equivalent to
$s \to \lambda s$, $\gamma \to \gamma$) and changes in $\mathcal{R}$ (equivalent to a change in $\gamma$),
will yield $\hat\psi$ which will satisfy (41) and (43) and (29). If further
the variables occur in the combination $s\boldsymbol{\xi}_i - \gamma\boldsymbol{\mathcal{R}}$
then one will also have invariance to translation along the internuclear
axis (a translation of the $\boldsymbol{\xi}_i$ along the axis being equivalent to
a change in $\gamma$) and therefore the
$\hat\psi$ will also satisfy the axial force theorem. Indeed the two terms
in (25) are each of this type, though in a slightly different notation,
i.e. with $\zeta$ and $\zeta c$ instead of $s$ and $\gamma\mathcal{R}$.

An interesting specialization of (33) is provided by trial functions
of the form

$$\tilde\psi = \varphi\Big(\frac{\boldsymbol{\xi}_1}{\mathcal{R}}, \frac{\boldsymbol{\xi}_2}{\mathcal{R}}, \ldots, \frac{\boldsymbol{\xi}_N}{\mathcal{R}}\Big) \tag{XXI-34}$$

Such a set then has the additional property that for fixed values of
$\boldsymbol{\eta}_i = \boldsymbol{\xi}_i/\mathcal{R}$ it is independent of $\mathcal{R}$. Therefore such a set will
yield $\hat\psi$'s which satisfy the generalized Hellmann-Feynman theorem for
$\mathcal{R}$ in $\boldsymbol{\eta}_i$ coordinates, and hence in any coordinates derived from
the $\boldsymbol{\eta}_i$ by an $\mathcal{R}$-independent transformation, for example the often
used orthogonal confocal elliptic coordinates (with the nuclei at the
foci). We will now show that the generalized Hellmann-Feynman theorem
for $\mathcal{R}$ in such coordinates is precisely the virial theorem.$^{3}$ Proof:
In $\boldsymbol{\eta}_i$ coordinates $H$ takes the form

$$H = \frac{t}{\mathcal{R}^2} + \frac{v}{\mathcal{R}}$$

where we have assumed that $\mathcal{R}$ is positive and where $t$ and $v$ are
independent of $\mathcal{R}$. Therefore the generalized Hellmann-Feynman
theorem in these coordinates yields (the power of $\mathcal{R}$ in the volume
element cancels out)

$$\frac{\partial\hat{E}}{\partial\mathcal{R}} = -\frac{2\hat{T}}{\mathcal{R}} - \frac{\hat{V}}{\mathcal{R}}$$

Multiplying through by $\mathcal{R}$ we therefore have (32) again:

$$2\hat{T} + \hat{V} + \mathcal{R}\,\frac{\partial\hat{E}}{\partial\mathcal{R}} = 0$$

However in general $H = t/\mathcal{R}^2 + v/|\mathcal{R}|$ and is therefore independent of
the sign of $\mathcal{R}$. Therefore since for fixed $\boldsymbol{\eta}_i$ the set of $\tilde\psi$
is independent of the sign of $\mathcal{R}$
it follows from Sec. XIV that $\hat{E}$ will depend only on $R$ and hence we
have the virial theorem. Thus to satisfy the virial theorem it is
sufficient to use a set of trial functions which in $\boldsymbol{\eta}$ type
coordinates is independent of $\mathcal{R}$.
As another exercise in scaling we will state the following without
detailed proof: Consider a general molecule in general coordinates,
and following Hurley,$^{4}$ let us refer the nuclear configuration to a
similar configuration according to

$$\mathbf{R}_A = \rho\,\mathbf{R}_A^{0}, \qquad \text{same } \rho \text{ for all } A,\ \rho > 0 \tag{XXI-35}$$

Using the pattern of Eq. (33) et seq. it is then easy to show that if
we use trial functions of the form

$$\tilde\psi = \varphi(s\mathbf{r}_1, s\mathbf{r}_2, \ldots, s\mathbf{r}_N;\ s\rho) \tag{XXI-36}$$

the $\hat\psi$ will satisfy Hurley's$^{4}$ form of the virial theorem

$$2\hat{T} + \hat{V} + \rho\,\frac{\partial\hat{E}}{\partial\rho} = 0 \tag{XXI-37}$$


XXII. ORTHOGONALITY AND RELATED THEOREMS


Eigenfunctions belonging to different eigenvalues of $H$ are
automatically orthogonal. As we will see in a moment, it is easy
enough to give sufficient conditions such that $\hat\psi_a$ and $\hat\psi_b$ will
be automatically orthogonal if $\hat{E}_a \neq \hat{E}_b$. However it must be
stated that so far, the only known way of realizing these conditions
a priori is to draw $\hat\psi_a$ and $\hat\psi_b$ from a common linear space,
and for this case we have already discussed the orthogonality properties
of the $\hat\psi$ in Sec. XI. Also there are various other theorems of
a similar type for which the same situation prevails. To put the
matter another way round, if the $\hat\psi$ involve nonlinear parameters
and/or functions, as in UHF and OHF and CI with nonlinear parameters,
then we do not expect the theorems to be satisfied, and this is in
agreement with experience.

To cover all the theorems at once let $\hat\psi_a$ be an optimal trial
function for a Hamiltonian $H_a$, and let $\hat\psi_b$ be an optimal trial
function for a Hamiltonian $H_b$. Then suppose that among the
variations of $\hat\psi_a$ which were possible in the set from which it was
drawn was $\delta\hat\psi_a = \delta\alpha\,A\hat\psi_b$, where $\delta\alpha$ is a small but otherwise
arbitrary complex number and $A$ is a Hermitian operator. Then from
(V-3) applied to $\hat\psi_a$ we find, with obvious changes in notation,

$$(A\hat\psi_b, (H_a - \hat{E}_a)\hat\psi_a) = 0 \tag{XXII-1}$$

Similarly if $\delta\hat\psi_b = \delta\beta\,A\hat\psi_a$ was a possible variation of $\hat\psi_b$ then from
the complex conjugate of (V-3) applied to $\hat\psi_b$ we find

$$(\hat\psi_b, (H_b - \hat{E}_b)A\hat\psi_a) = 0 \tag{XXII-2}$$

Subtracting these then we have

$$(\hat\psi_b, (AH_a - H_bA)\hat\psi_a) = (\hat{E}_a - \hat{E}_b)(\hat\psi_b, A\hat\psi_a) \tag{XXII-3}$$

Various special cases are now of interest:

(i) $A = 1$, $H_a = H_b$. Then Eq. (3) tells us that if
$\hat{E}_a \neq \hat{E}_b$, $\hat\psi_a$ will be orthogonal to $\hat\psi_b$. However as we noted at
the outset of the section, the one way we know to implement the
sufficient condition in an a priori manner is to draw $\hat\psi_a$ and $\hat\psi_b$
from a common linear space.

(ii) $A \neq 1$, $H_a = H_b = H$. For $\hat{E}_a \neq \hat{E}_b$, Eq. (3) is the
variational version of the so-called off-diagonal hypervirial theorem
for $A$. The one way we know to implement the sufficient condition
in an a priori manner is to draw $\hat\psi_a$ and $\hat\psi_b$ from a common linear
space which is invariant to the action of $A$. However in such a
case we can derive the theorem more directly as follows. Since the
$\hat\psi$ are eigenfunctions of $\bar{H}$, the projection of $H$ onto the linear
space,

$$(\hat\psi_b, (A\bar{H} - \bar{H}A)\hat\psi_a) = (\hat{E}_a - \hat{E}_b)(\hat\psi_b, A\hat\psi_a)$$

But $\pi\hat\psi = \hat\psi$ and $\pi A\hat\psi = A\hat\psi$, where $\pi$ is the projector onto
the linear space. Therefore
we can replace $\bar{H}$ by $H$ and we have the theorem.

(iii) $A = 1$, $H_a \neq H_b$. Eq. (3) is now the variational version
of the integral Hellmann-Feynman theorem.$^{2}$ The one way known to
implement the sufficient condition in an a priori manner is to draw $\hat\psi_a$
and $\hat\psi_b$ from a common linear space. However in such a case we can
derive the theorem directly as follows:

$$(\hat\psi_b, (\bar{H}_a - \bar{H}_b)\hat\psi_a) = (\hat{E}_a - \hat{E}_b)(\hat\psi_b, \hat\psi_a)$$

But $\pi\hat\psi_a = \hat\psi_a$ and $\pi\hat\psi_b = \hat\psi_b$ (note that under our
assumption the same $\pi$ is involved in $\bar{H}_a$ and $\bar{H}_b$) so that we may
replace $\bar{H}_a$ and $\bar{H}_b$ by $H_a$ and $H_b$, thereby deriving the
desired theorem.

(iv) $A \neq 1$, $H_a \neq H_b$. We leave it to the reader to name
and discuss this case.
APPENDIX A: THE MAX-MIN THEOREM

From (6) of Sec. II, we can characterize $E_k$, the $k$'th
smallest eigenvalue of $H$, by

$$E_k = \min_{\tilde\psi} \tilde{E}, \qquad (\tilde\psi, \psi_i) = 0, \quad i = 1, \ldots, k - 1 \tag{A1}$$

where the $\psi_i$ are the eigenfunctions of $H$ associated with the
lower eigenvalues. That is, one minimizes $\tilde{E}$ subject to the constraint
that the $\tilde\psi$ be orthogonal to the lower eigenfunctions. We now
want to point out that there exists another variation approach, the
so-called "Max-Min Theorem", which does not require explicit
information about lower states. Namely one can show that

$$E_k = \max_{\chi_1, \ldots, \chi_{k-1}}\ \min_{\tilde\psi} \tilde{E}, \qquad (\tilde\psi, \chi_i) = 0, \quad i = 1, \ldots, k - 1 \tag{A2}$$

where the $\chi_i$ are $k - 1$ arbitrary functions. In words, one first
fixes the $\chi_i$ and determines the minimum of $\tilde{E}$ subject to the
constraint that $\tilde\psi$ be orthogonal to the $\chi_i$. This minimum
is then a functional of the $\chi_i$. To find $E_k$ one then maximizes
with respect to the $\chi_i$. We will now give a brief proof that these
two definitions of $E_k$ are equivalent. A more detailed proof with
references and historical comment can be found in S. H. Gould,
Variational Methods for Eigenvalue Problems, Second Edition (Oxford, 1966)
Sec. II.6.

We first note that whatever functions one chooses for the $\chi_i$,
they span a space which is at most $k - 1$ dimensional, and that therefore
there is at least one linear combination of the $\psi_1, \ldots, \psi_k$ which is
orthogonal to all the $\chi_i$ and therefore is a suitable $\tilde\psi$ for
(2). However for this $\tilde\psi$ we have, writing it as $\sum_{i=1}^{k} A_i\psi_i$,
that

$$\tilde{E} = \frac{\sum_{i=1}^{k} |A_i|^2 E_i}{\sum_{i=1}^{k} |A_i|^2} \le E_k$$

Therefore, we have the result that whatever functions are chosen for the
$\chi_i$, the maximum of the minima in (2) cannot exceed $E_k$. On the other
hand if we choose the $\chi_i$ to be equal to the $\psi_i$ then
from (1) it follows that the minimum for this choice of the $\chi_i$ is
precisely $E_k$, hence (2) follows.
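The two halves of this proof can be illustrated on a small matrix. The sketch below is an illustration only, under stated assumptions: $H$ is taken to be the diagonal 3x3 matrix with eigenvalues 1, 2, 3, we look at $k = 2$ (one constraint function $\chi$), and the helper names `complement_basis` and `constrained_min` are hypothetical. For each $\chi$ it finds the minimum of the Rayleigh quotient in the plane orthogonal to $\chi$ by diagonalizing the projected 2x2 matrix in closed form.

```python
import math
import random

E = [1.0, 2.0, 3.0]   # eigenvalues of an assumed diagonal 3x3 "Hamiltonian"

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def complement_basis(chi):
    """Orthonormal basis of the plane orthogonal to the 3-vector chi."""
    basis = []
    for k in range(3):
        v = [1.0 if j == k else 0.0 for j in range(3)]
        # Gram-Schmidt: remove components along chi and accepted vectors.
        for b in [chi] + basis:
            c = dot(v, b) / dot(b, b)
            v = [x - c * y for x, y in zip(v, b)]
        norm = math.sqrt(dot(v, v))
        if norm > 1e-8 and len(basis) < 2:
            basis.append([x / norm for x in v])
    return basis

def constrained_min(chi):
    """Minimum of the Rayleigh quotient over vectors orthogonal to chi."""
    u, w = complement_basis(chi)
    hu = [e * x for e, x in zip(E, u)]
    hw = [e * x for e, x in zip(E, w)]
    a, b, d = dot(u, hu), dot(u, hw), dot(w, hw)
    # Lower eigenvalue of the projected 2x2 matrix [[a, b], [b, d]].
    return 0.5 * (a + d) - math.sqrt(0.25 * (a - d) ** 2 + b * b)

random.seed(0)
trials = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
minima = [constrained_min(chi) for chi in trials]
best = constrained_min([1.0, 0.0, 0.0])   # chi = lowest eigenvector

print("largest minimum over random chi:", max(minima))
print("minimum for chi = psi_1:", best)
```

No random choice of $\chi$ gives a minimum above $E_2 = 2$, and the choice $\chi = \psi_1$ attains it, which is the content of (A2) for $k = 2$.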

We will now use (2) to give an elegant derivation of some of the
results in Sec. VII. First we will derive the separation theorem
(VII-10). Let $\bar{H}$ be the projection of $H$ onto the $M + 1$ dimensional
space spanned by the $\varphi_1, \ldots, \varphi_M$ and $\varphi_{M+1}$. Then

$$\hat{E}_k(M+1) = \max_{\chi_1, \ldots, \chi_{k-1}}\ \min_{(\tilde\psi, \chi_i) = 0} \tilde{E}$$

where the $\tilde\psi$ are selected from the $M + 1$ dimensional space.
Comparing $\hat{E}_{k+1}(M+1)$ and $\hat{E}_k(M)$ we see that the prescriptions
are similar except that for $\hat{E}_{k+1}(M+1)$, $\chi_k$ is permitted to vary
while in $\hat{E}_k(M)$ it is in effect fixed at $\varphi_{M+1}$ (taken orthogonal
to the other $\varphi_i$). Thus the Max in
the latter case can't be higher than in the former case and we have

$$\hat{E}_k(M) \le \hat{E}_{k+1}(M+1)$$

Now let us compare $\hat{E}_k(M+1)$ and $\hat{E}_k(M)$. As far as the $\chi_i$ are
concerned the prescriptions are the same. However in the latter case
$\tilde\psi$ is more restricted so that the Min can't be lower and we have
$\hat{E}_k(M+1) \le \hat{E}_k(M)$, which completes the derivation of (VII-10).

Now following Perkins, J. Chem. Phys. 2156 (1966), we will
derive the analog of (VII-12) for an arbitrary excited state. (This
paper also contains some numerical examples.) If $\bar{H}$ is the projection
of H onto the M + 1 dimensional space spanned by the then 

the two procedures can be characterized by 


and by 


l£x 







respectively. 

of H , we have 
A 


E 


Thus whether or not the





are eigenfunctions 


However the are the eigenfunctions of H associated with 

the lower eigenvalues so we. also know that 

Ew ^ 

so that we have 

Eu ^ fe-} ^ 

which is the desired generalization of (VII-12) . 
APPENDIX B: LAGRANGE MULTIPLIERS


We wish to find the consequences of

$$\delta\tilde{E} = 0$$

in a situation in which the parameters and/or functions which label
$\tilde\psi$ satisfy certain equations of constraint

$$C_\alpha = 0, \qquad \alpha = 1, \ldots, K \tag{B1}$$

To be specific, and since it is the most easily visualized case,
suppose that $\tilde\psi$ depends on $M$ real parameters $a_1, \ldots, a_M$.
The direct approach is first to use the equations (1) to extract an
independent set of parameters, say $a_{K+1}, \ldots, a_M$, in terms of which
all the others may be expressed. Next one writes $\tilde{E}$ in terms of
these independent parameters and, denoting the result by $\mathcal{E}$,
calculates $\delta\mathcal{E}$ from

$$\delta\mathcal{E} = \sum_{i=K+1}^{M} \frac{\partial\mathcal{E}}{\partial a_i}\,\delta a_i$$

Then since the $\delta a_i$, $i = K+1, \ldots, M$, are arbitrary,

$$\delta\mathcal{E} = 0 \tag{B2}$$

yields

$$\frac{\partial\mathcal{E}}{\partial a_i} = 0, \qquad i = K+1, \ldots, M \tag{B3}$$

as the equations to be solved.

Another approach is the method of Lagrange multipliers.* Here the
prescription is to first require

$$\delta\tilde{E} - \sum_{\alpha=1}^{K} \lambda_\alpha\,\delta C_\alpha = 0 \tag{B4}$$

without regard to the lack of independence of the $\delta a_i$; that is,
one solves

$$\frac{\partial\tilde{E}}{\partial a_j} - \sum_{\alpha=1}^{K} \lambda_\alpha\,\frac{\partial C_\alpha}{\partial a_j} = 0, \qquad j = 1, \ldots, M \tag{B5}$$

where the $\lambda_\alpha$, the Lagrange multipliers, are, for the moment,
unknown parameters. The solutions of (5) will depend on the $\lambda_\alpha$,
and the latter are then to be chosen so that the equations (1) are
satisfied.

We now want to show that these two procedures are equivalent.
The point is simply that if the equations (1) are satisfied so that
we can use them to determine $a_1, \ldots, a_K$ in terms of
$a_{K+1}, \ldots, a_M$, then the equations (1) will also imply that

$$\sum_{j=1}^{M} \frac{\partial C_\alpha}{\partial a_j}\,\frac{\partial a_j}{\partial a_i} = 0, \qquad i = K+1, \ldots, M \tag{B6}$$

Therefore, multiplying (5) by $\partial a_j/\partial a_i$ and summing over $j$, we
find

$$\sum_{j=1}^{M} \frac{\partial\tilde{E}}{\partial a_j}\,\frac{\partial a_j}{\partial a_i} = 0, \qquad i = K+1, \ldots, M \tag{B7}$$

which clearly is the same as (3) since the left hand side is just
$\partial\mathcal{E}/\partial a_i$, with the understanding that we have used the constraints
to express $a_1, \ldots, a_K$ in terms of $a_{K+1}, \ldots, a_M$.


*Equations (1) and (5) are equivalent to $\delta L = 0$ without
constraint, where $L = \tilde{E} - \sum_\alpha \lambda_\alpha C_\alpha$: variation of the $a_j$ yields
(5) while variation of the $\lambda_\alpha$ yields (1).
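The equivalence of the two procedures can be seen on a toy problem. The sketch below is illustrative only and is not taken from the text: it assumes $\tilde{E} = \sum_i c_i a_i^2$ with the single constraint $\sum_i a_i - 1 = 0$, solves the multiplier equations (B5) together with (B1) in exact rational arithmetic, and then checks that the result also satisfies the direct conditions (B3) obtained by eliminating $a_1$. The function names are hypothetical.

```python
from fractions import Fraction

def lagrange_solution(c):
    """Stationary point of E = sum(c_i * a_i**2) subject to sum(a_i) = 1.

    (B5) reads 2*c_i*a_i - lam = 0, so a_i = lam / (2*c_i);
    the constraint (B1) then fixes the multiplier lam.
    """
    lam = 1 / sum(Fraction(1, 2 * ci) for ci in c)
    return [lam / (2 * ci) for ci in c], lam

def reduced_gradient(c, a):
    """dE/da_i (i >= 2) after eliminating a_1 = 1 - (a_2 + ... + a_M).

    Since da_1/da_i = -1, the chain rule of (B7) gives
    2*c_i*a_i - 2*c_1*a_1 for each independent parameter.
    """
    return [2 * c[i] * a[i] - 2 * c[0] * a[0] for i in range(1, len(c))]

c = [1, 2, 5]                       # assumed coefficients of the toy energy
a, lam = lagrange_solution(c)
grad = reduced_gradient(c, a)

print("a =", a)
print("lambda =", lam)
print("reduced gradient =", grad)
```

The reduced gradient vanishes identically at the multiplier solution, which is the statement that (B5) plus (B1) imply (B3).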
FOOTNOTES


Sec. II 


A. If one works in configuration space the existence of $(\tilde\psi, H\tilde\psi)$
requires among other things that $\tilde\psi$ be twice differentiable.
However this condition can be relaxed. If one uses $(\nabla\tilde\psi, \nabla\tilde\psi)$
instead of $(\tilde\psi, -\nabla^2\tilde\psi)$ then one can show that the results which we
will derive in this section will still hold even if $\tilde\psi$ is only
once differentiable. (See R. Courant and D. Hilbert, Methods of
Mathematical Physics, bottom of page 457.) Also even if $\tilde\psi$
is twice differentiable the $(\nabla\tilde\psi, \nabla\tilde\psi)$ form is often more
convenient numerically. However we will continue to use the
expression (1) because it is much easier to deal with formally.

B. In this and succeeding sections (and in the preceding footnote)
we will use the language of molecular bound state quantum mechanics.
In particular we refer to H as the Hamiltonian, having in mind
that it is the internal Hamiltonian (or some approximation thereto)
of a typical atom or molecule. However it should be kept in
mind as we proceed that many of our formal results hinge only on
H being a Hermitian operator, be it differential, integral or
finite matrix, with a (at least partially) discrete spectrum.

C. Since we have imposed no normalization requirements on and 

, the size of A is not a true measure of the difference 


between x 


and in that even if ^ is large they 


r\j 

may still be describing the same state - thus let 4-* ^ ~7 ^ 
An accurate measure provided by 


where 

is that part of jO> which is orthogonal to x .It is then 
of some interest that with (6), Eq. (7) can be written as 

D. If H is a finite matrix (recall footnote B) or more generally
if its spectrum is bounded from above then clearly the largest
eigenvalue is an absolute maximum of $\tilde{E}$.

Sec. Ill 

A. Another approach to getting a bound on the difference is to fix
one member by experiment. However in this connection it should
be kept in mind that H is almost certainly an approximate
Hamiltonian and therefore further corrections must be applied to
the $\hat{E}$ (or to the experimental data) in order that the two
numbers refer to the same physical (or mathematical) problem.

Sec. VI 

A. In introducing the linear variation method in Sec. V we stated
that the $\varphi_k$ should be linearly independent. This requirement
has played no real role until now. However if the $\varphi_k$ are
not linearly independent there will be less than M independent
equations in (V-12) and consequently the secular determinant will
vanish identically. Thus only if the $\varphi_k$ are linearly
independent is (1) really an equation for $\hat{E}$. In practice,
particularly when using large nonorthogonal basis sets, near linear
dependence can often become a real numerical problem.
B. The authors of the first of references 1 refer to their method as
the "method of moments". Unhappily this same name has also long
been applied to a particular version of the linear variation method
in which the $\varphi_k$ are taken to have the form $\varphi_k = H^k\varphi$,
where $\varphi$ is some given function. This particular choice of
the $\varphi_k$ however is of little use in molecular problems since
the integrals $(\varphi, H^L\varphi)$ may well not exist for L larger
than 2 or so even with a very reasonable choice for $\varphi$. [In
the paper by C-Y Hu, Phys. Rev. 167, 112 (1968) it might appear
that the method of moments has been applied to the Helium
Hamiltonian. However a careful reading of the paper shows that it is
actually being applied to a finite matrix approximation to that
Hamiltonian.] A detailed discussion, with bibliography, of this
method can be found in the paper of J. B. Delos and S. M. Blinder,
J. Chem. Phys. 47, 2784 (1967). They also propose a method
which is a computationally practical combination of
the two methods of moments, in which the functions of Sec. V take
the form of a $\delta$-function multiplying (not multiplied by) various
powers of H. They refer to it as the "$\delta$-method". It was
also proposed independently, together with a related method, by
H. J. Silverstone, M-L Yin, and R. L. Somorjai, J. Chem. Phys. 47,
4824 (1967), and some illustrative calculations have been made by
J. M. Rothstein, J. E. Welch, and H. J. Silverstone, J. Chem. Phys.
2932 (1969).
Sec. VII


A. If the linear space contains functions of various symmetries but
in the form of a direct sum (that is, the linear space can be
decomposed into linear subspaces each having a definite symmetry)
then since $\pi$ will commute with the symmetry operators so will
$\bar{H}$ if H does. Therefore the $\hat\psi_k$, as eigenfunctions of $\bar{H}$, will
automatically have, or can be chosen to have, definite symmetry. In
such a situation then the $\hat{E}_k$ and $E_k$ we are talking about
in this section are the successive $\hat{E}_k$ and $E_k$ of a given
symmetry. Finally if H commutes with the symmetry operators but
$\pi$ does not, so that in general the $\hat\psi_k$ won't have definite
symmetry, then all we will be able to say from the results of this
section is that the successive $\hat{E}_k$ are upper bounds to the successive
eigenvalues of H ordered without regard to symmetry.

B. If H has only K bound state eigenvalues (of appropriate
symmetry) then for the $\hat{E}_k$ with $k > K$ we will be able to conclude
only that they are all upper bounds to the highest bound state
of H. However in what follows we will not consider this
possibility explicitly. Also we will not worry about such interesting
things as bound states and quasi bound states imbedded in continua
of the same symmetry. For a recent review of the application of
the linear variation method to such situations see H. S. Taylor,
Advances in Chemical Physics 18 (1970), I. Prigogine and S. A. Rice
(ed.) (Interscience, New York). Also A. U. Hazi and H. S. Taylor,
Phys. Rev. 1109 (1970).
C. If H is a finite matrix, or more generally an operator whose
highest eigenvalues are all discrete and bounded from above,
then one can clearly prove a theorem analogous to that of (6),
Sec. II but with minimum replaced by maximum and smaller replaced
by larger (recall also footnote D, Sec. II). Correspondingly,

rw A 

since S in (1) is also larger than where U'' is the 

smallest ^ value for which , one can show that 

if H has dimension -then 

Sec. VIII

A. For nuclei the Pauli principle plays a large role in validating 

the independent particle picture. See for example V. F. Weisskopf, 
Physics Today, July 1961, page 18. 

Sec. IX

A. A point of notation. In our general discussion the symbol
( , ) has denoted a scalar product in the N-particle
space. In the first sum in (5) however we use the same symbol
for a scalar product in a one particle space and in the second
sum it is used for a scalar product in a two particle space. This
should cause no confusion if one keeps in mind always the nature
of the operators and functions involved. For example Eq. (7) 
below contains ) which is a scalar product over the 

variables of but which is still a function of the remaining 

variables in g . 

B. These other sets are readily shown to satisfy (12) with the 

the appropriate unitary transformation of the Si . The- essential 
point is that since $\tilde{E}$ involves the spin orbitals only in the
form of a "scalar product" $\sum_i \varphi_i(x)\varphi_i^*(x')$, it is invariant to such
a transformation of spin orbitals. For a detailed discussion see
C. C. J. Roothaan, Rev. Mod. Phys. 23, 69 (1951).

C. However even with these one-dimensional equations the fact that
the exchange terms involve an integral operator rather than being
a local potential often makes calculation difficult. Therefore
there has been considerable investigation and use of local
approximations to the exchange terms. The original suggestion was due
to J. C. Slater, Phys. Rev. 81, 385 (1951). For recent discussions
see T. M. Wilson, J. H. Wood, and J. C. Slater, Phys. Rev. A2,
620 (1970), and J. C. Slater and J. H. Wood, Int. J. Q. Chem. S4,
3 (1971).


D. The result which we have just proven is actually only a corollary
of what is really Koopmans' theorem: Let the spin orbitals be some
orthonormal set of UHF spin orbitals, not necessarily the canonical
set. Then we delete one of them to yield a trial function for
the N - 1 particle system. We now fix the spin orbitals, i.e. fix the
unitary transformation which relates them to the canonical spin
orbitals, by requiring that they be such as to make the energy
of the N - 1 particle system stationary. Since the energy of the
N particle system is fixed this then means ΔE stationary and
clearly we still have
since in the discussion in the text we used the canonical nature
of the spin orbitals only in the final step. We now require that
. Since §-^“^9 (see footnote B above) this 

means 

In addition we have constrained to be normalized so we 

also have 

^ r=s-v 

and thus we are led to 

That is (and this is really Koopmans' theorem) the optimal spin orbital
to remove when one uses trial functions of this sort for the N-1
particle system is precisely the canonical one. An analogous
theorem for excitation of the N particle system has been discussed
by W. J. Hunt and W. A. Goddard III, Chem. Phys. Lett. 3,
414 (1969).

E. The frozen spin orbital wave function would seem more appropriate
to describe the N-1 particle system immediately after the sudden
ionization of the N particle system (assuming that UHF gives
an adequate description of the N particle system). For a discussion
of the implications of £^£ — 6%^ from this point of view
see R. Manne and T. Åberg, Chem. Phys. Lett. 7, 282 (1970).

Sec. X 


A. In the case of ground states there is a large literature discussing
the question of whether or not such solutions, though they exist,
represent an absolute minimum or indeed even a local minimum within
UHF. For a recent discussion and an extensive list of references
see for example J. Paldus and J. Čížek, Phys. Rev. 2268 (1970).
See also J. I. Musher, Chem. Phys. Lett. 7, 397 (1970).

Sec. XI 

A. With this choice of ^ • Note however that 

quite generally, whatever and that 

Cv5'-rit!T^)-=. ^ , j'or some cases in which 

^ rather naturally equals ^ see K. H. Hansen, Theoret. 
Chim. Acta 6, 87 (1966); G. Gliemann, Theoret. Chim. Acta 11,
75 (1968), and references therein.

B. For some special properties of the choice G—TT) 

see W. H. Adams, J. Chem. Phys. 45, 3422 (1966). Note however

that except for the , the eigenfunctions of this operator 

are by no means obvious . With the choice all the 

©■(A which contribute to have «“^herefore 

completeness immediately yields the simple result 

4-': c u -fe 4^ 

tt< 

Sec. XII 

A. An interesting question which we won't pursue: If the set
"almost" satisfies the sufficient condition, then how nearly can

A 

one expect the 'r to have the desired property? 
Sec. XIII


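It would be sufficient for our purposes if the set of trial functions
was only "effectively" invariant to complex conjugation. That is, if
there were some constant A, depending perhaps on the trial function,
such that A times the complex conjugate of a trial function is in the
set if the trial function itself is. Since a constant multiple of a
function is physically equivalent to that function, the use of the
word "effectively" is clearly justified.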

Also similar remarks could be made in later sections where 
we will assume various other invariances in order to derive other 
theorems. However the distinction between invariant and effectively 
invariant is anyway overly pedantic since clearly, without changing 
the results of the variational calculation in any way we can change 
an effectively invariant set into an invariant one simply by making 
the overall scale of the trial functions arbitrary. Thus here and 
in the sections which follow we will, to simplify the presentation,
require invariance to various operations, though effective invariance
would suffice. 

Sec. XVI 


A. Some discussion and references to the early history of this theorem 
can be found in S. T. Epstein, Am. J. Phys. 613 (1954), and in
J. I. Musher, Am. J. Phys. 267 (1966). We will discuss the 
Hellmann-Feynman theorem in footnote D, Sec. XXI. 
In spite of our comments in the text about the simplicity and
sufficiency of Hurley's theorem, the following derivation is 
of interest for the linear case. Since is an eigen- 

function of we have 


Z(T- 



But if the set is invariant to changes in 



Therefore 


<T“ 


bo- 




then 
Sec. XVIII

A. S. T. Epstein and J. O. Hirschfelder, Phys. Rev. 123, 1495 (1961)
discuss the use of trial functions of the form C'’ more 

general 

Sec. XIX 

A. By average force on the nuclei we mean the average, using 

of the familiar classical expression. See footnote D, Sec. XXI 
for some comments on this definition in the case of molecules. 


Sec. XXI 

A. sy is the trace of the "tensor virial operator" 

I, - ■'•-I 

We have seen that by "isotropic scaling", i.e. by scaling all
components of the coordinates equally, we can satisfy the hypervirial
theorem for this trace. Similarly by "anisotropic scaling" (scaling
each component of the coordinates separately) one can guarantee the
hypervirial theorem for the diagonal elements of the tensor (for an
application see W. J. Meath and J. O. Hirschfelder, J. Chem. Phys. 39,
1135 (1963)). Finally by tensor scaling (D. Pandres, Phys. Rev.
131, 886 (1963)) one can simultaneously guarantee the theorem for
all components of the tensor. However, as always, symmetry alone may
be enough to guarantee some of these theorems (or to guarantee
some, given others).
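
Isotropic scaling is easy to illustrate numerically. The sketch below is not taken from the text: it assumes the hydrogen atom in atomic units with the normalized Gaussian trial function exp(-alpha r^2), for which the standard averages are T = 3 alpha / 2 and V = -2 sqrt(2 alpha / pi). Scaling all coordinates by a factor eta multiplies T by eta^2 and V by eta, so that making E(eta) = eta^2 T + eta V stationary should yield 2T + V = 0, the Coulomb form of the virial theorem, even though the Gaussian is far from an eigenfunction. The function names are hypothetical.

```python
import math

def gaussian_averages(alpha):
    """Kinetic and potential energy averages (atomic units) for the normalized
    trial function exp(-alpha r^2) in the hydrogen atom."""
    kinetic = 1.5 * alpha
    potential = -2.0 * math.sqrt(2.0 * alpha / math.pi)
    return kinetic, potential

def optimal_scale(kinetic, potential):
    """Scale factor eta that makes E(eta) = eta^2 T + eta V stationary."""
    return -potential / (2.0 * kinetic)

# Start from an arbitrary (unoptimized) exponent.
T, V = gaussian_averages(1.0)
eta = optimal_scale(T, V)

# Averages after isotropic scaling of the coordinates by eta.
T_scaled, V_scaled = eta ** 2 * T, eta * V
virial_residual = 2.0 * T_scaled + V_scaled
energy = T_scaled + V_scaled
print(eta, virial_residual, energy)
```

The optimally scaled energy equals -4/(3 pi), the familiar best single-Gaussian value for hydrogen, whatever exponent one starts from.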

B. For a polyatomic molecule the ^i.C) and the bond angles are not 

C 

all independent. Thus c: may be written in various ways as 
functions of these quantities. In (28) the sum runs over those
particular bond lengths, independent or not, which one has chosen
to use in expressing & 

C. If one does not use a fixed nucleus approximation then it is easy 

to show that if the internal Hamiltonian is written in terms of
internal Cartesian coordinates and their canonically con- 

jugate momenta 'O'* , then the hypervirial theorem for 

A- A. 

is the statement that 2T + V = 0 where T now is the average 
total (electronic and nuclear) internal kinetic energy and where 
V is the corresponding average potential energy. This theorem 
can be ensured by using a set of M' which is invariant to 
scaling of the 

D. Equation (26) is the variational version of the Hellmann-Feynman
theorem; H. Hellmann, Einführung in die Quantenchemie (Franz
Deuticke, Leipzig, 1937) p. 285, and R. P. Feynman, Phys. Rev. 56,
340 (1939). In words it says that, as classically, one can
calculate the force on a nucleus by calculating the negative
gradient of the energy with respect to nuclear coordinates. Actually
the theorem is often read the other way around. That is, because
of the use one makes of the energy in the Born-Oppenheimer approximation,
the average force on nucleus 1, say, is a priori taken to be the
negative gradient of the energy with respect to the coordinates of
that nucleus. The theorem then says that it can also be calculated from the
electronic charge density as the average value of the classical
force operator. (See for example P. Pulay, Mol. Phys. 17,
197 (1969), especially Sec. 3.) Moreover the theorem often appears
in other forms, forms which are all equivalent if various theorems
are satisfied (and if the wave function is an eigenfunction they all are
satisfied). Thus consider a diatomic molecule. Then if the force
theorems are satisfied we can write the force on nucleus one more
generally as




where ^ and ^ are arbitrary numbers except that 0.-1, = 
and various choices have been used in the literature. Also if,
as is usually the case, the forces are purely axial one needs only
the axial component of this quantity.
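
The content of the theorem for an eigenfunction can be checked on a small model. The sketch below is an illustration only: it assumes a hypothetical two-level Hamiltonian H(lam) = H0 + lam H1 with arbitrary real symmetric matrices, with lam standing in for a nuclear coordinate, and compares the average of dH/dlam in the exact ground state with a numerical derivative of the ground state energy.

```python
import math

# Hypothetical two-level model, chosen only for illustration.
H0 = [[-1.0, 0.3], [0.3, 0.5]]
H1 = [[0.2, 0.4], [0.4, -0.7]]   # this is dH/dlam

def hamiltonian(lam):
    """H(lam) = H0 + lam * H1 as a 2x2 nested list."""
    return [[H0[i][j] + lam * H1[i][j] for j in range(2)] for i in range(2)]

def ground_state(h):
    """Lowest eigenvalue and normalized eigenvector of a real symmetric 2x2
    matrix (closed form; assumes the off-diagonal element is nonzero)."""
    a, b, c = h[0][0], h[1][1], h[0][1]
    e = 0.5 * (a + b) - math.hypot(0.5 * (a - b), c)
    x, y = c, e - a              # solves (a - e) x + c y = 0
    norm = math.hypot(x, y)
    return e, (x / norm, y / norm)

def expectation(op, v):
    """Average value (v, op v) for a normalized real vector v."""
    return sum(v[i] * op[i][j] * v[j] for i in range(2) for j in range(2))

lam, step = 0.5, 1.0e-5
_, psi = ground_state(hamiltonian(lam))

# Hellmann-Feynman side: average of dH/dlam in the eigenfunction.
hf_derivative = expectation(H1, psi)

# Direct side: central-difference derivative of the eigenvalue.
e_plus, _ = ground_state(hamiltonian(lam + step))
e_minus, _ = ground_state(hamiltonian(lam - step))
numerical_derivative = (e_plus - e_minus) / (2.0 * step)
print(hf_derivative, numerical_derivative)
```

The two numbers agree to within the accuracy of the finite difference. For a variational trial function that is not an eigenfunction, such agreement is not automatic and depends on the invariance properties of the set of trial functions.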

E. The atomic orbitals in (18) are centered on the nuclei and are
invariant to rotation about the internuclear axis. For reasons
of symmetry one might expect the latter also to be true of the
optimal orbitals derived from (19), i.e. that the points on which
they are centered will be on the internuclear axis. However there is
no reason to expect that these points will coincide with the
positions of the nuclei. That is, as a price one pays for
translational invariance (and hence the force theorem) with such
a simple set of trial functions the atomic orbitals will have their cusps
off the nuclei. Following Hurley (ref. 7) trial functions like (19) are
often called "floating wave functions". Eigenfunctions for this
problem, of course, have cusps at the nuclei.

P. O. Löwdin, J. Mol. Spec. 46 (1959). Löwdin does not mention
any particular coordinate system. However he assumes, without
comment, that V depends only on ^ and not on or 


separately. 
Sec. XXII

A. We have already discussed orthogonality to some extent at the end
of Sec. VII. In some SCF calculations, orthogonality, though not
exact, can however be very nearly realized. See for example
P. Bagus, Phys. Rev. 139, A619 (1965). See also M. Cohen and
A. Dalgarno, Rev. Mod. Phys. 35, 506 (1963). For a discussion of
the situation for off diagonal hypervirial theorems see the references
and discussions given in ref. 1. For the integrated
Hellmann-Feynman theorem see S. T. Epstein, A. C. Hurley, R. E.
Wyatt, and R. G. Parr, J. Chem. Phys. 47, 1275 (1967) and references
cited there.

B. The way this is often done in practice is as follows. Let 

be a function derived in some way, usually by a variational 
calculation for mu3* Then one does new linear variational 

calculations for and Vivo , in each case using and 

as the basis set. If one wishes to satisfy a sequence of integral 
Hellmann-Feynman theorems involving a sequence of Hamiltonians 

, then one does a new linear variational calculation 
for each using the same set of as the basis. In 

particular (A. C. Hurley, Int. J. Q. Chem. 3^, 677 (1967)) if the 
sequence is continuous the linear variational method leads 

to homogeneous integral equations instead of algebraic equations . 
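
The reason a common basis set works can be seen in a small numerical sketch. It is an illustration under stated assumptions, not the procedure of any of the references: we assume the two basis functions have been orthonormalized, so that the two Hamiltonians are represented by real symmetric 2 x 2 matrices (the numbers below are hypothetical), and we take the integral Hellmann-Feynman theorem in its usual form, namely that the energy difference equals the matrix element of the difference of the Hamiltonians between the two wave functions, divided by their overlap.

```python
import math

def ground_state(h):
    """Lowest eigenvalue and normalized eigenvector of a real symmetric 2x2
    matrix (closed form; assumes the off-diagonal element is nonzero)."""
    a, b, c = h[0][0], h[1][1], h[0][1]
    e = 0.5 * (a + b) - math.hypot(0.5 * (a - b), c)
    x, y = c, e - a              # solves (a - e) x + c y = 0
    norm = math.hypot(x, y)
    return e, (x / norm, y / norm)

# Hypothetical matrices of the two Hamiltonians in one orthonormal basis.
H_A = [[-1.0, 0.2], [0.2, 0.4]]
H_B = [[-0.8, 0.5], [0.5, 0.1]]

# Separate linear variational calculations in the same basis.
e_a, c_a = ground_state(H_A)
e_b, c_b = ground_state(H_B)

# Integral Hellmann-Feynman expression built from the two solutions.
delta_h = [[H_B[i][j] - H_A[i][j] for j in range(2)] for i in range(2)]
transition = sum(c_a[i] * delta_h[i][j] * c_b[j]
                 for i in range(2) for j in range(2))
overlap = c_a[0] * c_b[0] + c_a[1] * c_b[1]
integral_hf = transition / overlap
print(integral_hf, e_b - e_a)
```

The two printed numbers coincide because each coefficient vector is an exact eigenvector of its own matrix within the shared basis, which is what the new linear variational calculations are meant to achieve. With a nonorthogonal basis the overlap matrix would enter, and the sketch would need a generalized eigenvalue problem instead.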



